1. Introduction

In this report, we describe work done at Kestrel under the DARPA High Performance Knowledge Bases (HPKB) program. The goal of the program is to develop methods for structuring large knowledge bases and reasoning efficiently in them.

The  report  contains  four  sections.  In  the  first,  we  describe  our  work  on  the  crisis 
management  challenge  problems.  In  the  second,  we  describe  our  Designware  system 
for  semi-automated  program  synthesis.  In  the  third,  we  present  a  detailed  description 
of  parametrized  specifications,  an  important  tool  for  combining  theory  refinements. 
In  the  fourth,  we  describe  limits,  interpretations,  and  slicing,  all  important  tools  for 
constructing,  deconstructing,  and  relating  theories. 

2.  Challenge  problem  experiments 

Kestrel applied its work on the technology base for HPKB to address the higher-performance aspects of the program’s goals. Our objective was to show how the underlying technology could provide meaningful speedups on large problems, and to show potential for speedup on even larger problems. We are encouraged by our findings so far: automatically generated solvers produced speedups ranging from two to five orders of magnitude, with potential applicability also to large KB acquisition problems.

SRI graciously suggested challenge questions, furnished Kestrel with base axiom sets and scenario axiom sets, and helped with our analyses. We measured baseline performance and three knowledge compilation approaches:


Method / TFQ                 236b      236c      236d      236e      210a      210b      210c

1. Conventional approach     200 s     199 s     204 s     181 s     94 s      83 s      82 s
   (answer extraction)

2. Knowledge compilation     39 ms     -         -         39 ms     2.3 ms    3.4 ms    3.2 ms
   (finite models)           (5000x)                       (4600x)   (40000x)  (24000x)  (25000x)

3. Knowledge compilation     310 ms    330 ms    310 ms    310 ms    -         -         -
   (schemas)                 (600x)    (600x)    (600x)    (600x)

All  times  were  measured  on  a  Sun  Ultrasparc  60. 


Each  table  row  shows  the  time  for  a  different  approach: 

Conventional approach The conventional approach is answer extraction via tracing of unification substitutions (“answer literals”), which SRI used for the TFQs. This technique is due to Green [16]. For this baseline, we used SRI’s SNARK, a good, well-suited, fast prover.

Knowledge Compilation via Finite Models Typically, major portions of a knowledge base have a finite model. For these portions, the model can be precomputed from the axiom set and incorporated into the prover via procedural attachment to predicates. We used a simple forward closure to compute the model as a set of ground literals. Another point of view is that we used partial evaluation, holding the axiom base fixed. Question answering is then done via simple operations on sets and relations. These operations could be further optimized using bit vectors, and we estimate that another two orders of magnitude should be achievable.

Knowledge  Compilation  via  Schemata  Several  hundred  of  the  axioms  in  the 
HPKB  crisis  management  database  have  a  similar  form,  varying  only  by  the  types  of 
agents,  actions,  and  interests  involved.  Because  an  axiom  can  only  mention  specific, 
concrete  types,  we  cannot  write  a  single  axiom  that  generalizes  all  of  them;  however, 
we  can  summarize  them  in  a  table.  Then,  instead  of  using  a  prover  for  queries,  we  use 
table  lookup,  which  is  much  faster.  We  could  integrate  this  technique  into  a  general 
prover  for  use  on  suitable  subproblems. 

Knowledge Compilation via Conditional Compilation (not shown) By working with abstract rather than concrete data, a prover can be outfitted to construct a program that computes a result later, when the actual concrete data is given to it. This principle is also used in partial evaluation. The program typically includes conditional branches that depend on the actual concrete data values, hence the name “conditional compilation”. This technique is also due to Green [16].

We  then  optimize  the  resulting  program  using  Kestrel’s  C  code  generator.  The 
resulting  program  is  eight  orders  of  magnitude  faster  than  standard  answer  extraction 
and  produces  answers  in  microseconds  rather  than  seconds.  Although  this  result  is 
surprising,  it  was  done  on  too  small  a  problem  (several  hundred  clauses)  to  be  anything 
but  suggestive.  Extending  this  approach  would  require  additional  research  on  problem 
solving  via  program  synthesis. 

In our work, we began with SRI’s answers, and, instead of trying to improve them, we focused on knowledge compilation to produce them more quickly. Also, we have been producing answers without explanations, but we expect it would not be difficult to generate the same explanations as a standard prover.

Scalability We expect our knowledge compilation techniques to scale, relative to “pure” inference provers. That is, the compilation techniques should maintain a significant advantage as question difficulty and knowledge base size increase. This is no surprise; special problem solvers are often used to assist provers. However, in our case, the special problem solvers were automatically generated.

We also believe that knowledge compilation can lead to better answers. Although answering (search) speed is not an evaluated property for HPKB, search speed translates into more efficient searches and therefore into larger spaces searched. Larger searches translate into the ability to answer harder questions. To pursue this issue, it is possible to measure the increase in the size of the explored space as a function of the degree of compilation for typical questions.

Knowledge acquisition The acquisition of useful, consistent knowledge bases is a difficult problem. Fast query responses help to discover omissions and inconsistencies, but more systematic analysis techniques are needed. Knowledge compilation will allow more analyses, some fast enough to run in the background as new knowledge is entered.

We have not yet developed industrial-strength knowledge compilation tools, although it appears feasible. Knowledge compilation will be essential for answering computationally intensive queries such as the network intervention and repair problems for which we generated algorithms earlier in the HPKB program. Large KBs will also profit from the modularization and composition mechanisms available in Specware.

2.1.  Conventional  approach 

In this section, we describe our first approach to the HPKB Crisis Management challenge problems. This work is unusual because it combines deductive and procedural reasoning using the procedural attachment mechanism of the SNARK theorem prover. In the competition, the solutions we obtained to this class of problems were markedly superior to those of the competing team.

2.1.1. Escalation and retaliation Parametrized question 210 was unusual because, in addition to ordinary axiomatic reasoning, it required the use of a procedural attachment to get some of the effects of nonmonotonic reasoning. Instances of the question require us to determine whether a particular action in a particular context constitutes an escalation, a de-escalation, or a retaliation.

The  contexts  include  both  historical  incidents  and  fictional  scenarios  for  the  Crisis 
Management  Challenge  Problem  (CMCP).  We  regard  an  action  in  a  context  as  an 
escalation  if  it  is  in  response  to  another,  less  hostile  action.  Conversely,  an  action  is 
a  de-escalation  if  it  is  in  response  to  a  more  hostile  action.  Finally,  an  action  is  a 
retaliation  for  another  if  it  is  a  reaction  to  the  other  and  the  two  actions  are  opposed 
to  each  other  or  have  contrary  interests. 

The formal encodings of the incidents and scenarios contain a detailed account of the agents that each action opposes, the interests that motivate them, and the causal relationship between actions; this allows us to deal with questions about retaliation. To solve questions involving escalation, however, it is necessary to estimate the level of hostility of each action.

2.1.2. Hostility levels Because there are infinitely many possible individual actions, we cannot assign hostility levels to actions individually. Instead, we compute the hostility level of an action according to its sort and other characteristics, because there is a manageable number of these characteristics, and because they are known in advance.

One notion of hostility level was proposed by Herman Kahn (“On Escalation”), who suggested a forty-four-stage linear scale. This idea was modified by a CMCP subject-matter expert during an interview (“Knowledge Acquisition for Crisis Management: Interests and Actions,” John Kingston, AIAI, University of Edinburgh). He suggested a multi-dimensional scale for hostility levels, because of the difficulty in comparing actions of different kinds. Our own scale currently has three components: damage level, weapon level, and proximity level.

The  damage  level  is  an  estimate  of  what  kind  of  damage  the  action  involves.  A 
military  attack,  for  example,  is  likely  to  involve  population  damage,  which  is  the  most 
severe  level.  Public  criticism  of  one  government  by  another  is  likely  to  reach  only  the 
verbal  damage  level,  which  is  much  less  severe. 

The weapon level reflects what sort of weapons the attack involves. For example, an attack using biological weapons has wmd-level, the most severe, because it uses a weapon of mass destruction (WMD). An attack that involves no weapons has the lowest weapon level of all.

The proximity level concerns the location of the attack. An attack on the heart of another country has the most severe proximity level; an attack that is outside the borders of the target country has the lowest proximity level.

All the actions in the scenarios and historical narratives are classified according to their sort in the ontology. For instance, in the 1984-88 Tanker War historical account, the invasion of Iran by Iraq is classified as a military-invasion; the action in which Kuwait negotiates with other countries to protect its shipping is classified as making-an-agreement.

For certain sorts, we provide a damage level and a weapon level. The proximity level of an action is determined not by a sort but by the location of the attack.

For instance, if an action is of sort terrorist-attack, it is assigned a population-damage-level and a target-indiscriminate-conventional-level weapon level. If it is performed in the capital city of the target country, it is of heart-proximity-level, the most severe.

Hostility levels of two actions can be compared using a lexicographic ordering. More precisely, the action with the higher damage level has the higher hostility level; if the damage levels are equal, the action with the higher weapon level has the higher hostility level; and if both the damage levels and the weapon levels are equal, the action with the higher proximity level has the higher hostility level.

For example, an attack that kills population is regarded as more severe than one that only damages property, even if the first uses pistols (target-specific-weapon-level) and is on the outskirts of the target country (inside-proximity-level), while the second uses a bomb (target-indiscriminate-conventional-level) and is in the center of the target country (heart-proximity-level).
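The lexicographic comparison can be sketched in Python, whose tuples already compare component by component. The numeric ranks are assumptions made for illustration: the text fixes only the relative order of some levels. The names property-damage-level, no-weapon-level, and outside-proximity-level are hypothetical labels for levels the text describes but does not name.

```python
from typing import NamedTuple

# Illustrative ranks only (higher means more severe). The report names most of
# these levels but gives no numeric values; entries marked "hypothetical" are
# labels invented here for levels the text only describes in words.
DAMAGE = {
    "de-escalation-damage-level": 0,
    "verbal-damage-level": 1,
    "property-damage-level": 2,  # hypothetical name
    "population-damage-level": 3,
}
WEAPON = {
    "no-weapon-level": 0,  # hypothetical name
    "target-specific-weapon-level": 1,
    "target-indiscriminate-conventional-level": 2,
    "wmd-level": 3,
}
PROXIMITY = {
    "outside-proximity-level": 0,  # hypothetical name
    "inside-proximity-level": 1,
    "heart-proximity-level": 2,
}


class Hostility(NamedTuple):
    """Hostility level; field order gives the lexicographic priority."""
    damage: int
    weapon: int
    proximity: int


def hostility(damage, weapon, proximity):
    """Build a hostility level from the three named component levels."""
    return Hostility(DAMAGE[damage], WEAPON[weapon], PROXIMITY[proximity])


def greater_hostility(a, b):
    """True if a is strictly more hostile than b (tuple comparison)."""
    return a > b


pistol_attack = hostility(
    "population-damage-level",
    "target-specific-weapon-level",
    "inside-proximity-level",
)
bomb_attack = hostility(
    "property-damage-level",
    "target-indiscriminate-conventional-level",
    "heart-proximity-level",
)
```

Under these assumed ranks, greater_hostility(pistol_attack, bomb_attack) holds, because the damage component is decisive and the weapon and proximity components are consulted only to break ties.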

2.1.3. Procedural attachment It was impossible to use axioms in the ordinary way to define hostility levels of actions, because of the nonmonotonicity of the problem. It is quite usual for an action to be of more than one sort; for example, one sort may be a subsort of another, so an action of the first sort is also of the second sort.

If we use assignment axioms of the form

if ?action is of sort Ai
then damage-level(?action) = di

and an action is of two sorts A1 and A2, it might be assigned conflicting damage levels d1 and d2, leading to an inconsistent axiom set.

For instance, one sort in the ontology, convene-task-force-to-monitor-responses, is assigned a de-escalation damage level, which is very low, because it is involved in truce making. This sort, however, is classified in the ontology as a subsort of a larger sort, political-action. Actions of this sort are assigned a verbal damage level, which is somewhat more severe, because hostile political actions such as threat-making are also of this sort.

The  situation  is  illustrated  by  this  table: 


Sort                                       Damage level

convene-task-force-to-monitor-responses    de-escalation-damage-level
political-action                           verbal-damage-level

At one stage in the CMCP Year 2 Scenario, the UN Secretary General persuades Iran and the GCC to participate in a Persian Gulf regional forum. This action is classified in the scenario description as being of sort convene-task-force-to-monitor-responses. Hence, the above assignment axiom can be applied to assign the action a de-escalation damage level. Since the agreement of Iran and the GCC is also a political action, the assignment axiom could be applied as well to assign it a verbal damage level, contradicting the previous assignment.

We would prefer to compute the damage level of an action by using the smallest subsort that includes the action and that has an assigned damage level, since the smaller subsorts give more specific information. If a subsort has no assigned damage level, we use the damage level of a larger sort that contains it.

Such  an  operation  is  not  monotonic:  finding  out  new  information,  such  as  an 
assignment  of  a  damage  level  to  a  sort,  could  result  in  an  inference  becoming  invalid. 
For  example,  if  an  inference  depended  on  the  fact  that  the  agreement  has  a  verbal 
damage  level,  and  then  we  discover  later  that  convening  a  task  force  has  a  lower 
damage  level,  the  earlier  inference  will  no  longer  be  valid. 

Axiomatic reasoning can perform only monotonic inferences, which do not allow the retraction of any conclusion as a result of discovering new evidence. Therefore, instead of using axiomatic reasoning, we invoke a procedural attachment mechanism to compute damage and weapon levels.

Procedural attachment is a SNARK feature that enables us to associate selected predicate and function symbols in SNARK with Lisp functions that evaluate them. This allows us to circumvent axiomatic reasoning when we have a procedural way of determining the value of the symbol. In this case, we have procedural attachments for the damage-level and weapon-level function symbols. In computing the level for an action, the attachment looks at the sort(s) of that action. The damage and weapon levels of the smallest sort with an assigned level are taken as the damage and weapon levels of the action. If there is more than one such smallest sort, the highest of the levels is selected.

For example, in determining the damage level of the action in which Iran and the GCC make the agreement, the function will assign the de-escalation damage level, which is associated with the sort convene-task-force-to-monitor-responses. The verbal damage level, which is associated with the sort political-action, is ignored, because convene-task-force-to-monitor-responses is a proper subsort of political-action; that is, it is smaller.
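The selection rule used by the attachment can be sketched as follows. This is a Python illustration rather than the Lisp function actually attached in SNARK; the hierarchy fragment, the root sort action, and the numeric ranks are assumptions made for the example.

```python
# Hypothetical fragment of the sort hierarchy: each sort maps to its direct
# supersorts. The root sort "action" is assumed for illustration.
SUPERSORTS = {
    "convene-task-force-to-monitor-responses": ["political-action"],
    "political-action": ["action"],
    "action": [],
}

# Damage levels assigned to sorts, as in the table in the text.
ASSIGNED_DAMAGE = {
    "convene-task-force-to-monitor-responses": "de-escalation-damage-level",
    "political-action": "verbal-damage-level",
}

# Assumed severity ranks, used only to break ties between smallest sorts.
RANK = {"de-escalation-damage-level": 0, "verbal-damage-level": 1}


def ancestors(sort):
    """Return all proper supersorts of a sort."""
    seen = set()
    stack = list(SUPERSORTS.get(sort, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUPERSORTS.get(current, []))
    return seen


def damage_level(action_sorts):
    """Damage level of the smallest sort(s) of the action that have one."""
    # Every sort of the action, and every supersort, that has a level.
    candidates = set()
    for sort in action_sorts:
        candidates |= {sort} | ancestors(sort)
    candidates = {s for s in candidates if s in ASSIGNED_DAMAGE}
    # Keep a sort only if no other candidate is a proper subsort of it.
    smallest = [
        s for s in candidates
        if not any(s in ancestors(t) for t in candidates if t != s)
    ]
    if not smallest:
        return None
    # Several incomparable smallest sorts: select the highest level.
    return max((ASSIGNED_DAMAGE[s] for s in smallest), key=RANK.__getitem__)
```

The computation is nonmonotonic in the sense described above: adding an entry to ASSIGNED_DAMAGE for a smaller sort can change the level returned for an action without any other entry being withdrawn.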

2.1.4. Axiomatic Reasoning Although procedural attachment is used to compute hostility levels, ordinary axiomatic reasoning is used to determine whether an action in a narrative is to be regarded as an escalation, a de-escalation, or a retaliation. For instance, one axiom states that an action is an escalation if it is caused by another action and is of a greater hostility level than the earlier action. More precisely, the escalation axiom is

(assertion
  (<=
    (escalation ?action2 ?context)
    (and
      (occurs-in ?action2 ?context)
      (cause-event-event* ?action1 ?action2)
      (greater-hostility ?action2 ?action1)))
  :name escalation-if-in-response-to-less-hostile-action
  :documentation "An ?action2 is an escalation in ?context
if it is in response to an ?action1 and is of greater hostility
level than ?action1.")

A  similar  axiom  for  de-escalation  is 

(assertion
  (<=
    (de-escalation ?action2 ?context)
    (and
      (occurs-in ?action1 ?context)
      (cause-event-event* ?action1 ?action2)
      (greater-hostility ?action1 ?action2)))
  :name de-escalation-if-in-response-to-more-hostile-action
  :documentation "An ?action2 is a de-escalation in ?context
if it is in response to an ?action1 that has a greater
hostility level")

There  are  other  axioms  for  escalation  and  de-escalation  too.  For  instance,  an 
action  whose  damage  level  is  de-escalation  (for  example,  a  ceasefire)  is  automatically 
regarded  as  a  de-escalation. 

An action is a retaliation for another action if it is caused by that action and the two actions are opposed to each other’s agents or are motivated by opposing interests. For instance, if ?agent1 performs an action that damages the interests of ?agent2, and in response ?agent2 performs an action that damages the interests of ?agent1, the latter action is presumed to be a retaliation. More precisely, we have the retaliation axiom

(assertion
  (<=
    (retaliation ?action2 ?action1 ?context)
    (and
      (occurs-in ?action2 ?context)
      (performed-by ?action2 ?agent2)
      (performed-by ?action1 ?agent1)
      (cause-event-event* ?action1 ?action2)
      (or
        (and (opposing ?action1 ?agent2)
             (opposing ?action2 ?agent1))
        (and (opposes-interest ?action1 ?interest1)
             (supports-interest ?action2 ?interest1))
        (and (supports-interest ?action1 ?interest2)
             (opposes-interest ?action2 ?interest2)))))
  :name retaliation-is-in-response-to-earlier-action
  :documentation "A retaliation is an action caused by an earlier action,
such that the actions are opposed to each other's agents or
the two actions have opposing interests.")

2.1.5. Example Let us see how these techniques apply to a sample Challenge Problem. In the final evaluation, question TFQ210c asks: “In the 1984-8 Tanker War, is the event in which Iraq accede to request of UN during 20 July 1987 a de-escalation of conflict?” Note that the question is ungrammatical because it is generated automatically from a template.

This  question  is  formalized  as  follows: 

(and (occurs-in ?action2 1984-1988-tanker-war)
     (instance-of ?action2 accept)
     (performed-by ?action2 iraq)
     (action-involves ?action2 ?agreement)
     (action-enabled-by ?action2 ?action1)
     (performed-by ?action1 united-nations)
     (instance-of ?action1 making-an-agreement)
     (temporal-bounds-contain
       (day-fn 20 (month-fn July (year-fn 1987))) ?action1)
     (de-escalation ?action2 1984-1988-tanker-war))

Most  of  the  formalization  simply  describes  the  event,  paraphrasing  the  question.  This 
serves  to  bind  to  ?action2  the  event  under  discussion.  The  key  component  of  the 
formalization  is  the  final  conjunct, 

(de-escalation ?action2 1984-1988-tanker-war),

which  asks  if  the  action  is  a  de-escalation. 

By  the  axiom  for  de-escalation  we  gave  earlier,  the  system  knows  that  an  action 
is  a  de-escalation  if  it  is  in  response  to  an  earlier  action  of  greater  hostility.  In  the 
formal  description  of  the  Tanker  War,  it  discovers  an  axiom 

(contributing-factor
  iran-directly-challenges-ships-in-us-convoys
  iraq-accepts-un-resolution-598)

In other words, a contributing factor in Iraq’s acceptance of the UN resolution is the fact that Iran has directly challenged ships in a US convoy.

An  axiom  in  the  knowledge  base  tells  us  that  if  one  action  is  a  contributing  factor 
in  another,  it  is  a  cause  of  the  other  action: 

(assertion
  (<=
    (cause-event-event* ?event1 ?event2)
    (contributing-factor ?event1 ?event2))
  :name contributing-factor-implies-causes-indirectly
  :documentation "If an event is a contributing factor for another,
it also causes it indirectly")

Therefore,  Iran’s  challenge  of  the  US  convoy  is  a  cause  of  Iraq’s  acceptance  of  the  UN 
resolution. 

Acceptance of the UN resolution is declared in the formalization of the Tanker War to be of sort making-an-agreement, which has a verbal damage level, which is relatively low. On the other hand, Iran’s challenge of the US convoy is declared to be of sort threatening-action, which is declared to have damage level non-combat-action. Since this damage level is higher than that of a verbal action, the hostility level of Iran’s challenge is higher than that of Iraq’s acceptance of the UN resolution. This is because the ordering on hostility levels has been asserted to be a lexicographic ordering on its components.

In short, Iraq’s acceptance of the UN resolution has been found to be in response to another action of higher hostility. Hence, by the de-escalation axiom, it is a de-escalation.

2.2.  Finite  models 

In  this  approach,  we  observe  that  major  portions  of  a  KB  have  a  finite  model,  and 
that  we  can  reason  more  quickly  in  this  model  than  in  the  KB  itself. 

To  build  the  model,  we  compute  the  transitive  closure  of  the  axioms,  and  attach  to 
each  predicate  the  set  of  its  instances.  Then,  to  answer  a  query,  we  find  the  instances 
satisfying  each  literal  and  combine  them  using  intersection  and  union. 
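The following Python sketch illustrates the idea on a two-fact fragment of the Tanker War example. It is a minimal sketch under stated assumptions, not the forward closure implemented at Kestrel: it assumes function-free Horn rules, variables written with a leading "?", and helper names (match, forward_closure, instances) that are invented here.

```python
def match(pattern, fact, binding):
    """Return binding extended so that pattern matches fact, or None."""
    if len(pattern) != len(fact):
        return None
    extended = dict(binding)
    for term, value in zip(pattern, fact):
        if term.startswith("?"):
            # A variable must be bound consistently across the body.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def forward_closure(facts, rules):
    """Apply Horn rules (head, body) until no new ground literal appears."""
    model = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            bindings = [{}]
            for literal in body:
                extended = []
                for binding in bindings:
                    for fact in model:
                        new_binding = match(literal, fact, binding)
                        if new_binding is not None:
                            extended.append(new_binding)
                bindings = extended
            for binding in bindings:
                derived = tuple(binding.get(term, term) for term in head)
                if derived not in model:
                    model.add(derived)
                    changed = True
    return model


def instances(model):
    """Attach to each predicate the set of argument tuples satisfying it."""
    table = {}
    for predicate, *args in model:
        table.setdefault(predicate, set()).add(tuple(args))
    return table


CHALLENGE = "iran-directly-challenges-ships-in-us-convoys"
ACCEPTANCE = "iraq-accepts-un-resolution-598"

FACTS = {
    ("contributing-factor", CHALLENGE, ACCEPTANCE),
    ("instance-of", ACCEPTANCE, "making-an-agreement"),
}
RULES = [
    # contributing-factor-implies-causes-indirectly
    (("cause-event-event*", "?e1", "?e2"),
     [("contributing-factor", "?e1", "?e2")]),
]

table = instances(forward_closure(FACTS, RULES))

# Query: which agreements were caused by Iran's challenge?
caused = {e2 for e1, e2 in table["cause-event-event*"] if e1 == CHALLENGE}
agreements = {e for e, s in table["instance-of"] if s == "making-an-agreement"}
answer = caused & agreements
```

Once the closure has been computed, the query involves no inference: each literal selects a set of instances, and the conjunction is a set intersection.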

Although  computing  the  transitive  closure  took  some  time,  the  resulting  ground 
unit  KB  was  quite  manageable  in  size.  Query  answering  took  less  than  a  second, 
and  aggressive  optimization  probably  could  reduce  the  response  time  to  under  one 
millisecond. 

This technique works directly from a given KB, with little or no human intervention. Although preprocessing is time-consuming, much of this work can be avoided if the original KB is organized in tables of ground data.

2.3. Parametric queries

For  parametric  queries,  such  as  PQ210,  we  found  that  axioms  and  explanations  came 
in  symmetry  groups.  A  general  query  asked  once  and  for  all  could  be  reused  by 
instantiation  to  all  members  in  a  given  symmetry  group.  For  PQ210,  we  were  able  to 
group  over  200  axioms,  thus  representing  about  1000  different  explanations  within  a 
single  scheme. 

Structuring a set of axioms in this way helped to identify missing axioms. Under the sort assumption that a terrorist group is an instance of a criminal organization, we found that our added axioms could answer all queries under TQ210.

Knowledge is captured using generic templates that are summarized in tables, instead of enumerating very similar axioms verbosely.

For the first and largest template,

(prove-performed-meta-theorem)

proves the generic theorem that justifies retrieving answers by a simple table lookup instead of querying the knowledge base.

(evaluate-interest-typically-underlies agent action)

performs this lookup, whereas

(ask-interest-typically-underlies agent action)

does  the  corresponding  theorem  proving  directly.  However,  given  the  meta-proof, 
we  obtain  an  explanation  for  each  instance  by  simple  substitution  of  the  parameter 
agent  and  action  sorts  into  the  explanation  (which  we  cannot  access  yet  as  we  need  a 
compatible  version  of  OKBC  and  other  SRI/SAIC  installed  layers  on  top  of  SNARK 
and  LISP). 
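The lookup and the explanation by substitution can be sketched in Python as follows. The table entry, the interest name, and the wording of the explanation template are all hypothetical; the function name merely mirrors evaluate-interest-typically-underlies and is not the Lisp function itself.

```python
# Hypothetical table: (agent sort, action sort) -> interests. The real table
# summarizes several hundred axioms; this single entry is invented.
INTEREST_TABLE = {
    ("country", "peacekeeping-mission"): {"regional-stability"},
}

# Hypothetical explanation schema. In the approach described above, the
# schema would come from the proof of the meta-theorem.
EXPLANATION_TEMPLATE = (
    "An agent of sort {agent} performing an action of sort {action} "
    "typically has the interest {interest}."
)


def evaluate_interest_typically_underlies(agent_sort, action_sort):
    """Answer by table lookup, instantiating one explanation per answer."""
    interests = INTEREST_TABLE.get((agent_sort, action_sort), set())
    return [
        (
            interest,
            EXPLANATION_TEMPLATE.format(
                agent=agent_sort, action=action_sort, interest=interest
            ),
        )
        for interest in sorted(interests)
    ]


answers = evaluate_interest_typically_underlies(
    "country", "peacekeeping-mission"
)
```

The cost of a query is one dictionary access plus one string substitution per answer, independent of the number of axioms the table summarizes.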

Precompiling a generic query-answering function for the template class of axioms, which constituted about 90 percent of the available data, has a dramatic effect on timing. For instance, to answer which interests typically underlie a country’s conducting a peacekeeping mission, SNARK generates the last answer instance at the 7,101st derived clause:

; Summary of computation:
;     7,101 formulas have been input or derived (from 5,454 formulas).
;     6,609 (93%) were retained.  Of these,
;       793 (12%) were simplified or subsumed later,
;         0 ( 0%) were deleted later because the agenda was full, and
;     5,816 (88%) are still being kept.

Run time in seconds excluding printing time:

     1.5    2%   Resolution
     0.5    0%   Paramodulation
    14.0   14%   Forward subsumption
     2.1    2%   Backward subsumption
     2.2    2%   Forward simplification
     0.7    1%   Backward simplification
    58.5   60%   Sorts
     9.9   10%   Kif input
     8.8    9%   Other
    98.2         Total

By contrast, the alternative answering mechanism, which uses a simple table lookup, is instantaneous.

From a meta theorem, we can generate the table by extracting all model axioms as answers and have the query answering mechanism dispatch on the sorts in the table entries. The table can naturally also be generated a priori, because the axioms reflect the nature of the table.

2.4.  Conditional  formation 

In  this  approach,  we  convert  axioms  into  conditional  expressions,  then  use  these 
expressions  to  create  a  highly  optimized  C  program.  For  example,  if  we  begin  with 
the  axiom 

nation(x)  iff  x  =  USA  or  x  =  England  or  ... 

we  generate  this  program 

boolean nation (object x) {
    return (x == USA) ||
           (x == England) || ...;
}
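The same specialization step can be illustrated in Python by generating the source text of a predicate from the axiom’s list of constants and then loading it. This is only a sketch of the idea: compile_membership is a hypothetical helper, the list of constants is truncated to the two named in the axiom, and Kestrel’s generator emits C rather than Python.

```python
def compile_membership(name, constants):
    """Generate and load a predicate specialized to a fixed set of constants."""
    # One equality test per constant; an empty disjunction is false.
    tests = " or ".join(f"x == {c!r}" for c in constants) or "False"
    source = f"def {name}(x):\n    return {tests}\n"
    namespace = {}
    exec(source, namespace)  # turn the generated text into a function
    return namespace[name], source


# Only the two constants named in the axiom are used; the rest are elided
# in the text and are not filled in here.
nation, nation_source = compile_membership("nation", ["USA", "England"])
```

The generated function contains no reference to the axiom or to a prover; all that remains at query time is a chain of comparisons.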

Using this method, we obtained dramatic speedups (answers in microseconds), but we only tried small examples. Still, potentially enormous improvements in performance are plausible.

2.5. Temporal reasoner

Reasoning about events in time is pervasive in the HPKB knowledge base. The temporal logic covered by the Allen temporal relations, extended with date reasoning, has been adequate to capture most of the relevant properties.

As part of Kestrel’s collaboration with Cycorp, we developed a temporal reasoner based on the Allen temporal relations in Cycorp’s knowledge base programming language /iLISP. The temporal reasoner maintains a database of relations between time intervals. Intervals I and J can be related in the following ways:

1. before(I, J): I is strictly before J.

2. meets(I, J): I ends where J starts.

3. join(I, J): I starts strictly before J and ends inside J.

4. ??(I, J): I starts strictly before J and ends where J ends.

5. ??(I, J): I starts where J starts and ends strictly before J ends.

6. equal(I, J): intervals I and J are the same.

together with the symmetric versions, for example met-by(I, J) <-> meets(J, I).

Since this base set of relations enumerates all the possible relations between two closed intervals, it is directly closed under negation: the negation of a temporal relation $R_i$ is the disjunction of the other relations, $\bigvee_{j \neq i} R_j$. The temporal reasoner uses a 7 by 7 table to compute the transitive closure of relations. For example, if before(I, J) and join(J, K) hold, then before(I, K) holds. On the other hand, if meets(I, J) and meets(J, K) hold, then either meets(I, K) or before(I, K) holds. The table of all Allen temporal relationships is given in [3]. However, that table contains a typo: one of its entries is missing a disjunction. To avoid repeating this typo and introducing new ones, we synthesized the table using decision procedures for rational numbers. For this purpose, we translated the Allen temporal primitives to arithmetical relations using the schema:

$\mathrm{before}(I, J) \equiv \mathrm{End}(I) < \mathrm{Start}(J)$

$\mathrm{meets}(I, J) \equiv \mathrm{End}(I) = \mathrm{Start}(J)$

$\mathrm{starts}(I, J) \equiv \mathrm{Start}(I) = \mathrm{Start}(J) \land \mathrm{End}(I) < \mathrm{End}(J)$

Then, for intervals $I$, $J$, $K$, we formed all triples of the form

$R_i(I, J) \land R_j(J, K) \land R_k(I, K)$

for temporal relations $R_i$, $R_j$, $R_k$, and included $R_k$ in the $(R_i, R_j)$ entry if and only if the arithmetical predicate was satisfiable. (The fact that we used a decision procedure for rational arithmetic was not exploited fully, as the decision problems only required reasoning about a linear order.)

Using this approach, we generated the relevant table for the Allen temporal reasoner and delivered it to Cycorp.
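The synthesis can be imitated in a few lines of Python. The sketch below replaces the decision procedure with brute-force search, which is sufficient because only the order of the endpoints matters: three intervals have six endpoints, so six integer values realize every possible ordering. It assumes the relation definitions of the IntervalRelations spec (including its axiom that Start may equal End), adds Equal from the list above, and builds a 13 by 13 table rather than the 7 by 7 table used by the reasoner.

```python
from itertools import product

# Base relations on closed intervals (start, end), following the definitions
# in the IntervalRelations spec.
BASE = {
    "Before":   lambda i, j: i[1] < j[0],
    "Meets":    lambda i, j: i[1] == j[0],
    "Overlaps": lambda i, j: i[0] < j[0] < i[1] < j[1],
    "Starts":   lambda i, j: i[0] == j[0] and i[1] < j[1],
    "During":   lambda i, j: j[0] < i[0] and i[1] < j[1],
    "Finishes": lambda i, j: i[1] == j[1] and j[0] < i[0],
}
INVERSE_NAMES = {
    "Before": "After",
    "Meets": "MetBy",
    "Overlaps": "OverlappedBy",
    "Starts": "StartedBy",
    "During": "Contains",
    "Finishes": "FinishedBy",
}

RELATIONS = dict(BASE)
for name, inverse in INVERSE_NAMES.items():
    # Bind the base relation now; the inverse swaps the two arguments.
    RELATIONS[inverse] = lambda i, j, r=BASE[name]: r(j, i)
RELATIONS["Equal"] = lambda i, j: i == j

# Six values are enough to realize any ordering of six endpoints.
POINTS = range(6)
INTERVALS = [(s, e) for s in POINTS for e in POINTS if s <= e]


def holding(i, j):
    """Names of all relations that hold between intervals i and j."""
    return frozenset(n for n, r in RELATIONS.items() if r(i, j))


def composition_table():
    """Entry (r1, r2) collects every r3 such that
    r1(I, J) & r2(J, K) & r3(I, K) is satisfiable."""
    rel = {(i, j): holding(i, j) for i in INTERVALS for j in INTERVALS}
    table = {(r1, r2): set() for r1 in RELATIONS for r2 in RELATIONS}
    for i, j, k in product(INTERVALS, repeat=3):
        for r1 in rel[i, j]:
            for r2 in rel[j, k]:
                table[r1, r2] |= rel[i, k]
    return table


TABLE = composition_table()
```

With these definitions, TABLE["Before", "Overlaps"] is {"Before"}, matching the first example above, and the entry for ("Meets", "Meets") contains both "Before" and "Meets". Because the spec admits intervals with Start = End, some entries may contain additional relations that arise only for such degenerate intervals.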



2.5.1.  Table  generation  with  Specware  In  this  paragraph  we  outline  how  the 
table  was  generated  using  Specware’s  user  syntax. 

spec IntervalRelations =
  sort interval

  op Start : interval -> Nat
  op End : interval -> Nat

  axiom fa(i : interval) (Start(i) <= End(i))

  def Before(I,J) = End(I) < Start(J)
  def After(I,J) = Before(J,I)
  def During(I,J) =
    Start(J) < Start(I) & End(I) < End(J)
  def Contains(I,J) = During(J,I)
  def Overlaps(I,J) =
    Start(I) < Start(J) &
    Start(J) < End(I) &
    End(I) < End(J)
  def OverlappedBy(I,J) = Overlaps(J,I)
  def Meets(I,J) = End(I) = Start(J)
  def MetBy(I,J) = Meets(J,I)
  def Starts(I,J) =
    Start(I) = Start(J) & End(I) < End(J)
  def StartedBy(I,J) =
    Starts(J,I)
  def Finishes(I,J) =
    End(I) = End(J) & Start(J) < Start(I)
  def FinishedBy(I,J) = Finishes(J,I)

  conjecture *tr-lt*-1 is
    not(trStart(a,b) & trStart(b,c) & trStart(a,c))

  conjecture *tr-fc*-49 is
    not(trFinishedBy(a,b) & trFinishedBy(b,c) & trFinishedBy(a,c))
end-spec


Curiously, we later found that the above-mentioned typo in [3] had propagated to
other implementations of the Allen temporal reasoner, where it caused undesired
behaviour in the underlying theorem prover.

3.  Designware 

This section presents a mechanizable framework for software development by refinement.
The framework is based on a category of higher-order specifications. The key
idea is representing knowledge about programming concepts, such as algorithm design,
datatype refinement, and expression simplification, by means of taxonomies of
specifications and morphisms.
The framework is partially implemented in the research systems Specware, Designware,
and Planware. Specware provides basic support for composing specifications
and refinements via colimit, and for generating code via logic morphisms. Specware is
intended to be general-purpose and has found use in industrial settings. Designware
extends Specware with taxonomies of software design theories and support for constructing
refinements from them. Planware builds on Designware to provide highly
automated support for requirements acquisition and synthesis of high-performance
scheduling algorithms.

3.1.  Overview 

A  software  system  can  be  viewed  as  a  composition  of  information  from  a  variety  of 
sources,  including 

—  the  application  domain, 

—  the  requirements  on  the  system’s  behavior, 

—  software design knowledge about system architectures, algorithms, data structures,
code optimization techniques, and

—  the  run-time  hardware/software/physical  environment  in  which  the  software  will 
execute. 

This  section  presents  a  mechanizable  framework  for  representing  these  various  sources 
of  information,  and  for  composing  them  in  the  context  of  a  refinement  process.  The 
framework  is  founded  on  a  category  of  specifications.  Morphisms  are  used  to  structure 
and  parameterize  specifications,  and  to  refine  them.  Colimits  are  used  to  compose 
specifications.  Diagrams  are  used  to  express  the  structure  of  large  specifications,  the 
refinement  of  specifications  to  code,  and  the  application  of  design  knowledge  to  a 
specification. 

The  framework  features  a  collection  of  techniques  for  constructing  refinements 
based  on  formal  representations  of  programming  knowledge.  Abstract  algorithmic 
concepts,  datatype  refinements,  program  optimization  rules,  software  architectures, 
abstract  user  interfaces,  and  so  on,  are  represented  as  diagrams  of  specifications  and 
morphisms.  We  arrange  these  diagrams  into  taxonomies,  which  allow  incremental 
access  to  and  construction  of  refinements  for  particular  requirement  specifications. 
For  example,  a  user  may  specify  a  scheduling  problem  and  select  a  theory  of  global 
search  algorithms  from  an  algorithm  library.  The  global  search  theory  is  used  to 
construct  a  refinement  of  the  scheduling  problem  specification  into  a  specification 
containing  a  global  search  algorithm  for  the  particular  scheduling  problem. 

The framework is partially implemented in the research systems Specware, Designware,
and Planware. Specware provides basic support for composing specifications
and refinements, and generating code. Code generation in Specware is supported by
inter-logic morphisms that translate between the specification language/logic and the
logic of a particular programming language (e.g. CommonLisp or C++). Specware
is intended to be general-purpose and has found use in industrial settings. Designware
extends Specware with taxonomies of software design theories and support for
constructing refinements from them. Planware provides highly automated support for
requirements acquisition and synthesis of high-performance scheduling algorithms.

The remainder of this section covers basic concepts and the key ideas of our approach
to software development by refinement, in particular the concept of design
by classification [26]. We also discuss the application of these techniques to
domain-specific refinement in Planware [5]. A detailed presentation of a derivation in
Designware is given in [27].

3.2.  Basic  Concepts 

3.2.1.  Specifications  A  specification  is  the  finite  presentation  of  a  theory.  The 
signature  of  a  specification  provides  the  vocabulary  for  describing  objects,  operations, 
and  properties  in  some  domain  of  interest,  and  the  axioms  constrain  the  meaning  of 
the  symbols.  The  theory  of  the  domain  is  the  closure  of  the  axioms  under  the  rules 
of  inference. 

Example:  Here  is  a  specification  for  partial  orders,  using  notation  adapted  from 
Specware.  It  introduces  a  sort  E  and  an  infix  binary  predicate  on  E,  called  le,  which 
is constrained by the usual axioms. Although Specware allows higher-order specifications,
first-order formulations are sufficient for most purposes.

spec Partial-Order is
  sort E
  op _le_ : E, E → Boolean
  axiom reflexivity is x le x
  axiom transitivity is x le y ∧ y le z ⇒ x le z
  axiom antisymmetry is x le y ∧ y le x ⇒ x = y
end-spec

Example:  Containers  are  constructed  by  a  binary  join  operator  and  they  represent 
finite  collections  of  elements  of  some  sort  E.  The  specification  shown  in  Figure  1 
includes  a  definition  by  means  of  axioms.  Operators  are  required  to  be  total.  The 
constructor  clause  asserts  that  the  operators  {empty,  singleton,  join}  construct  the 
sort  Container,  providing  the  basis  for  induction  on  Container. 

The  generic  term  expression  will  be  used  to  refer  to  a  term,  formula,  or  sentence. 

spec Container is
  sorts E, Container
  op empty : → Container
  op singleton : E → Container
  op _join_ : Container, Container → Container
  constructors {empty, singleton, join} construct Container
  axiom ∀(x : Container)(x join empty = x ∧ empty join x = x)
  op _in_ : E, Container → Boolean
  definition of in is
    axiom x in empty = false
    axiom x in singleton(y) = (x = y)
    axiom x in U join V = (x in U ∨ x in V)
  end-definition
end-spec

Fig.  1.  Specification for Containers

A model of a specification is a structure of sets and total functions that satisfy the
axioms. However, for software development purposes we have a less well-defined notion
of semantics in mind: each specification denotes a set of possible implementations in
some computational model. Currently we regard these as functional programs. A
denotational semantics maps these into classical models.
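To make the reading of a specification as a functional program concrete, here is a minimal Python sketch of one possible implementation of the Container operators and of the definition of in from Figure 1. The tagged-tuple encoding and the names EMPTY, singleton, join, and member are assumptions of this sketch, not part of Specware.

```python
# Containers are encoded as tagged tuples built from the three constructors.
EMPTY = ("empty",)

def singleton(y):
    return ("singleton", y)

def join(u, v):
    return ("join", u, v)

def member(x, c):
    """x in c, following the three defining axioms of `in` case by case."""
    tag = c[0]
    if tag == "empty":          # x in empty = false
        return False
    if tag == "singleton":      # x in singleton(y) = (x = y)
        return x == c[1]
    # x in (U join V) = (x in U or x in V)
    return member(x, c[1]) or member(x, c[2])
```

This term encoding does not identify x join empty with x as data, so it is only one candidate implementation; member nevertheless gives the same answer on both sides of that axiom.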

3.2.2.  Morphisms  A specification morphism translates the language of one specification
into the language of another specification, preserving the property of provability,
so that any theorem in the source specification remains a theorem under
translation.

A specification morphism m : T → T' is given by a map from the sort and
operator symbols of the domain spec T to the symbols of the codomain spec T'. To
be a specification morphism it is also required that every axiom of T translates to a
theorem of T'. It then follows that a specification morphism translates theorems of
the domain specification to theorems of the codomain.

Example:  A specification morphism from Partial-Order to Integer is:

morphism Partial-Order-to-Integer is
  {E ↦ Integer, le ↦ ≤}

Translation of an expression by a morphism is by straightforward application of the
symbol map, so, for example, the Partial-Order axiom x le x translates to x ≤ x.
The three axioms of a partial order remain provable in Integer theory after translation.
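The translation of axioms along a symbol map can be sketched in a few lines of Python. Formulas are encoded here as nested tuples, and the translated axioms are tested on a finite sample of integers rather than proved; the encoding and all names are assumptions made for illustration.

```python
import operator
from itertools import product

# Formulas: ("le", a, b), ("eq", a, b), ("and", f, g), ("implies", f, g).
PARTIAL_ORDER_AXIOMS = {
    "reflexivity": ("le", "x", "x"),
    "transitivity": ("implies",
                     ("and", ("le", "x", "y"), ("le", "y", "z")),
                     ("le", "x", "z")),
    "antisymmetry": ("implies",
                     ("and", ("le", "x", "y"), ("le", "y", "x")),
                     ("eq", "x", "y")),
}

MORPHISM = {"E": "Integer", "le": "<="}

def translate(formula, symbol_map):
    """Apply the symbol map to every operator symbol in a formula."""
    head, *args = formula
    new_args = [translate(a, symbol_map) if isinstance(a, tuple) else a
                for a in args]
    return (symbol_map.get(head, head), *new_args)

# Interpretation of the Integer operators used in this sketch.
INTEGER_OPS = {"<=": operator.le, "<": operator.lt}

def holds(formula, env):
    head, *args = formula
    if head == "and":
        return holds(args[0], env) and holds(args[1], env)
    if head == "implies":
        return (not holds(args[0], env)) or holds(args[1], env)
    if head == "eq":
        return env[args[0]] == env[args[1]]
    return INTEGER_OPS[head](env[args[0]], env[args[1]])

def check_on_sample(formula, values=range(-3, 4)):
    """Test (not prove) a formula on all assignments from a finite sample."""
    return all(holds(formula, dict(zip("xyz", vs)))
               for vs in product(values, repeat=3))
```

Mapping le to the strict order < instead would not give a morphism, since the translated reflexivity axiom x < x fails on every integer.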

Morphisms  come  in  a  variety  of  flavors;  here  we  only  use  two.  An  extension  or 
import  is  an  inclusion  between  specs. 

Example:  We can build up the theory of partial orders by importing the theory of
preorders. The import morphism is {E ↦ E, le ↦ le}.

spec PreOrder is
  sort E
  op _le_ : E, E → Boolean
  axiom reflexivity is x le x
  axiom transitivity is x le y ∧ y le z ⇒ x le z
end-spec

spec Partial-Order is
  import PreOrder
  axiom antisymmetry is x le y ∧ y le x ⇒ x = y
end-spec

A definitional extension, written A -d→ B, is an import morphism in which any
new symbol in B also has an axiom that defines it. Definitions have implicit axioms
for existence and uniqueness. Semantically, a definitional extension has the property
that each model of the domain has a unique expansion to a model of the codomain.

Example:  Container can be formulated as a definitional extension of
Pre-Container:

spec Pre-Container is
  sorts E, Container
  op empty : → Container
  op singleton : E → Container
  op _join_ : Container, Container → Container
  constructors {empty, singleton, join} construct Container
  axiom ∀(x : Container)(x join empty = x ∧ empty join x = x)
end-spec

spec Container is
  imports Pre-Container
  definition of in is
    axiom x in empty = false
    axiom x in singleton(y) = (x = y)
    axiom x in U join V = (x in U ∨ x in V)
  end-definition
end-spec

A parameterized specification can be treated syntactically as a morphism.
Example:  The specification Container can be parameterized on a spec Triv with
a single sort:

spec Triv is
  sort E
end-spec

via

parameterized-spec Parameterized-Container : Triv → Container is
  {E ↦ E}

A functorial semantics for first-order parameterized specifications via coherent
functors is given in Section 4.

3.2.3.  The Category of Specs  Specification morphisms compose in a straightforward
way as the composition of finite maps. It is easily checked that specifications
and specification morphisms form a category SPEC. Colimits exist in SPEC and are
easily computed. Suppose that we want to compute the colimit of
$B \xleftarrow{i} A \xrightarrow{j} C$.
First, form the disjoint union of all sort and operator symbols of A, B, and C, then
define an equivalence relation on those symbols:

$s \approx t$ iff $(i(s) = t \vee i(t) = s \vee j(s) = t \vee j(t) = s)$.

The signature of the colimit (also known as pushout in this case) is the collection of
equivalence classes wrt $\approx$. The cocone morphisms take each symbol into its
equivalence class. The axioms of the colimit are obtained by translating and collecting each
axiom of A, B, and C.
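The signature part of this construction can be sketched in Python with a union-find structure, which computes the equivalence generated by the identifications above. Signatures are modelled as plain sets of symbol names and morphisms as dictionaries; sorts and operators are not distinguished and axioms are not translated, so this is an illustrative simplification, and the function name is hypothetical.

```python
def pushout_signature(sig_a, sig_b, sig_c, i, j):
    """Equivalence classes of symbols for the pushout of B <-i- A -j-> C.

    sig_a, sig_b, sig_c are sets of symbol names; i maps symbols of A to
    symbols of B and j maps symbols of A to symbols of C. Each symbol is
    tagged with its spec of origin to form the disjoint union.
    """
    nodes = ([("A", s) for s in sig_a]
             + [("B", s) for s in sig_b]
             + [("C", s) for s in sig_c])
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the class representative, compressing paths.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    def union(m, n):
        parent[find(m)] = find(n)

    # Identify each symbol of A with its images under i and j.
    for s in sig_a:
        union(("A", s), ("B", i[s]))
        union(("A", s), ("C", j[s]))

    classes = {}
    for n in nodes:
        classes.setdefault(find(n), set()).add(n)
    return list(classes.values())
```

For three specs that each have the symbols E and le, with identity maps for i and j, the result is two classes of three symbols each.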

Example:  Suppose that we want to build up the theory of partial orders by composing
simpler theories.


The pushout of Antisymmetry ← BinRel → PreOrder is isomorphic to the
specification for Partial-Order in Section 3.2.1. In detail: the morphisms are
{E ↦ E, le ↦ le} from BinRel to both PreOrder and Antisymmetry. The equivalence
classes are then {{E,E,E}, {le,le,le}}, so the colimit spec has one sort (which
we rename E), and one operator (which we rename le). Furthermore, the axioms of
BinRel, Antisymmetry, and PreOrder are each translated to become the axioms of
the colimit. Thus we have Partial-Order.

Example:  The pushout operation is also used to instantiate the parameter in a
parameterized specification [6]. The binding of argument to parameter is represented
by a morphism. To form a specification for Containers of integers, we compute the
pushout of Container ← Triv → Integer, where Container ← Triv is {E ↦ E},
and Triv → Integer is {E ↦ Integer}.

Example:  A specification for sequences can be built up from Container, also via
pushouts. We can regard Container as parameterized on a binary operator

spec BinOp is
  sort E
  op _bop_ : E, E → E
end-spec

morphism Container-Parameterization : BinOp → Container is
  {E ↦ E, bop ↦ join}

and we can define a refinement arrow that extends a binary operator to a semigroup:

spec Associativity is
  import BinOp
  axiom Associativity is ((x bop y) bop z) = (x bop (y bop z))
end-spec

The pushout of Associativity ← BinOp → Container produces a collection specification
with an associative join operator, which is Proto-Seq, the core of a sequence
theory (see Appendix in [27]). By further extending Proto-Seq with a commutativity
axiom, we obtain Proto-Bag theory, the core of a bag (multiset) theory.

3.2.4.  Diagrams  Roughly, a diagram is a graph morphism to a category, usually
the category of specifications in our work. For example, the pushout described above
started with a diagram comprised of two arrows:

BinRel ─────────→ PreOrder
   │
   ↓
Antisymmetry

and computing the pushout of that diagram produces another diagram:

BinRel ─────────→ PreOrder
   │                  │
   ↓                  ↓
Antisymmetry ───→ Partial-Order

A diagram commutes if the composition of arrows along two paths with the same
start and finish node yields equal arrows.
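For morphisms represented as finite symbol maps, commutativity of a square reduces to comparing two composite maps. A minimal Python sketch, with hypothetical function names and with morphisms encoded as dictionaries:

```python
def compose(f, g):
    """Composite symbol map: apply f first, then g."""
    return {s: g[t] for s, t in f.items()}

def square_commutes(top, right, left, bottom):
    """True if going across then down equals going down then across."""
    return compose(top, right) == compose(left, bottom)

# The pushout square for Partial-Order: every arrow is the identity on {E, le}.
IDENT = {"E": "E", "le": "le"}
PARTIAL_ORDER_SQUARE_COMMUTES = square_commutes(IDENT, IDENT, IDENT, IDENT)
```

If one of the four arrows sent le to a different symbol than the others, the two composites would differ and the check would fail.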

3.2.5.  The  Structuring  of  Specifications  Colimits  can  be  used  to  construct  a 
large  specification  from  a  diagram  of  specs  and  morphisms.  The  morphisms  express 
various  relationships  between  specifications,  including  sharing  of  structure,  inclusion 
of  structure,  and  parametric  structure.  Several  examples  will  appear  later. 

Example:  The finest-grain way to compose Partial-Order is via the colimit of

                 BinRel
               ↙   ↓   ↘
Reflexivity  Transitivity  Antisymmetry

3.2.6.  Refinement  and  Diagrams  As  described  above,  specification  morphisms 
can  be  used  to  help  structure  a  specification,  but  they  can  also  be  used  to  refine 
a  specification.  When  a  morphism  is  used  as  a  refinement,  the  intended  effect  is  to 
reduce  the  number  of  possible  implementations  when  passing  from  the  domain  spec 
to  the  codomain.  In  this  sense,  a  refinement  can  be  viewed  as  embodying  a  particular 
design  decision  or  property  that  corresponds  to  the  subset  of  possible  implementations 
of  the  domain  spec  which  are  also  possible  implementations  of  the  codomain. 

Often  in  software  refinement  we  want  to  preserve  and  extend  the  structure  of  a 
structured  specification  (versus  flattening  it  out  via  colimit).  When  a  specification  is 
structured  as  a  diagram,  then  the  corresponding  notion  of  structured  refinement  is  a 
diagram  morphism.  A  diagram  morphism  M  from  diagram  D  to  diagram  E  consists 
of  a  set  of  specification  morphisms,  one  from  each  node/spec  in  D  to  a  node  in  E 
such  that  certain  squares  commute  (a  functor  underlies  each  diagram  and  a  natural 
transformation underlies each diagram morphism). We use the notation D ⇒ E
for diagram morphisms.

Example:  A datatype refinement that refines bags to sequences can be presented
as the diagram morphism BtoS : BAG ⇒ BAG-AS-SEQ:

[Figure: diagram morphism BtoS from diagram BAG to diagram BAG-AS-SEQ]

where the domain and codomain of BtoS are shown in boxes, and the (one) square
commutes. Here Bag-as-Seq is a definitional extension of Seq that provides an image
for Bag theory. Specs for Bag, Seq and Bag-as-Seq and details of the refinement can
be found in Appendix A of [27]. The interesting content is in spec morphism BtoS_Bag:

morphism BtoS_Bag : Bag → Bag-as-Seq is
  {Bag               ↦ Bag-as-Seq,
   empty-bag         ↦ bag-empty,
   empty-bag?        ↦ bag-empty?,
   nonempty?         ↦ bag-nonempty?,
   singleton-bag     ↦ bag-singleton,
   singleton-bag?    ↦ bag-singleton?,
   nonsingleton-bag? ↦ bag-nonsingleton?,
   in                ↦ bag-in,
   bag-union         ↦ bag-union,
   bag-wfgt          ↦ bag-wfgt,
   size              ↦ bag-size}

Diagram morphisms compose in a straightforward way based on spec morphism
composition. It is easily checked that diagrams and diagram morphisms form a category.
Colimits in this category can be computed using left Kan extensions and colimits
in SPEC. In the sequel we will generally use the term refinement to mean a diagram
morphism.

3.2.7.  Logic Morphisms and Code Generation  Inter-logic morphisms [24] are
used to translate specifications from the specification logic to the logic of a programming
language. See [28] for more details. They are also useful for translating between
the specification logic and the logic supported by various theorem-provers and analysis
tools, and for translating between the theory libraries of various systems.
3.3.  Software Development by Refinement

[Figure: vertical chain of refinements S_0 ⇓ S_1 ⇓ S_2 ⇓ ... ⇓ S_n, followed by code generation from S_n to Code]

The development of correct-by-construction code via a formal refinement process
is shown in the figure. The refinement process starts with a specification S_0 of the
requirements on a desired software artifact. Each S_i, i = 0, 1, ..., n represents a
structured specification (diagram) and the arrows ⇓ are refinements (represented as
diagram morphisms). The refinement from S_i to S_{i+1} embodies a design decision which
cuts down the number of possible implementations. Finally an inter-logic morphism
translates a low-level specification S_n to code in a programming language. Semantically
the effect is to narrow down the set of possible implementations of S_n to just
one, so specification refinement can be viewed as a constructive process for proving
the existence of an implementation of specification S_0 (and proving its consistency).

Clearly,  two  key  issues  in  supporting  software  development  by  refinement  are:  (1) 
how  to  construct  specifications,  and  (2)  how  to  construct  refinements.  Most  of  the 
sequel  treats  mechanizable  techniques  for  constructing  refinements. 


3.3.1.  Constructing Specifications  A specification-based development environment
supplies tools for creating new specifications and morphisms, for structuring
specs into diagrams, and for composing specifications via importation, parameterization,
and colimit. In addition, a software development environment needs to support
a large library of reusable specifications, typically including specs for (1) common
datatypes, such as integer, sequences, finite sets, etc. and (2) common mathematical
structures, such as partial orders, monoids, vector spaces, etc. In addition to these
generic operations and libraries, the system may support specialized construction
tools and libraries of domain-specific theories, such as resource theories, or generic
theories about domains such as satellite control or transportation.


3.3.2.  Constructing Refinements  A refinement-based development environment
supplies tools for creating new refinements. One of our innovations is showing how
a library of abstract refinements can be applied to produce refinements for a given
specification. In this section, we focus mainly on refinements that embody design
knowledge about (1) algorithm design, (2) datatype refinement, and (3) expression
optimization. We believe that other types of design knowledge can be similarly expressed
and exploited, including interface design, software architectures, domain-specific
requirements capture, and others. In addition to these generic operations and libraries,
the system may support specialized construction tools and libraries of domain-specific
refinements.


The  key  concept  of  this  work  is  the  following:  abstract  design  knowledge  about 
datatype  refinement,  algorithm  design,  software  architectures,  program  optimization 
rules, visualization displays, and so on, can be expressed as refinements (i.e. diagram
morphisms). The domain of one such refinement represents the abstract structure that
is  required  in  a  user’s  specification  in  order  to  apply  the  embodied  design  knowledge. 
The  refinement  itself  embodies  a  design  constraint  -  the  effect  is  a  reduction  in  the  set 
of  possible  implementations.  The  codomain  of  the  refinement  contains  new  structures 
and  definitions  that  are  composed  with  the  user’s  requirement  specification. 


The figure shows the application of a library refinement A ⇒ B to a given
(structured) specification S_0. First the library refinement is selected. The applicability
of the refinement to S_0 is shown by constructing a classification arrow from A to S_0,
which classifies S_0 as having A-structure by making explicit how S_0 has at least the
structure of A. Finally the refinement is applied by computing the pushout in the
category of diagrams. The creative work lies in constructing the classification
arrow [25,26].


3.4.  Scaling  up 

The process of refining specification S_0 described above has three basic steps:

1.  select a refinement A ⇒ B from a library,

2.  construct a classification arrow A ⇒ S_0, and

3.  compute the pushout of B ⇐ A ⇒ S_0.

The resulting refinement is the cocone arrow S_0 ⇒ S_1. This basic refinement process
is repeated until the relevant sorts and operators of the spec have sufficiently explicit
definitions that they can be easily translated to a programming language, and then
compiled.

In  this  section  we  address  the  issue  of  how  this  basic  process  can  be  further 
developed  in  order  to  scale  up  as  the  size  and  complexity  of  the  library  of  specs  and 
refinements grows. The first key idea is to organize libraries of specs and refinements
into taxonomies. The second key idea is to support tactics at two levels: theory-specific
tactics  for  constructing  classification  arrows,  and  task-specific  tactics  that  compose 
common  sequences  of  the  basic  refinement  process  into  a  larger  refinement  step. 


3.4.1.  Design  by  Classification:  Taxonomies  of  Refinements  A  productive 
software  development  environment  will  have  a  large  library  of  reusable  refinements, 
letting  the  user  (or  a  tactic)  select  refinements  and  decide  where  to  apply  them.  The 
need  arises  for  a  way  to  organize  such  a  library,  to  support  access,  and  to  support 
efficient  construction  of  classification  arrows.  A  library  of  refinements  can  be  organized 
into  taxonomies  where  refinements  are  indexed  on  the  nodes  of  the  taxonomies,  and 
the  nodes  include  the  domains  of  various  refinements  in  the  library.  The  taxonomic 
links  are  refinements,  indicating  how  one  refinement  applies  in  a  stronger  setting  than 
another. 


         Container
             ↓
         Proto-Seq
         ↙       ↘
  Proto-Bag       Seq


Fig.  2.  Taxonomy of Container Datatypes


Figure  2  sketches  a  taxonomy  of  abstract  datatypes  for  collections.  The  arrows 
between  nodes  express  the  refinement  relationship;  e.g.  the  morphism  from  Proto-Seq 
to  Proto-Bag  is  an  extension  with  the  axiom  of  commutativity  applied  to  the  join 
constructor  of  Proto-Seqs.  Datatype  refinements  are  indexed  by  the  specifications  in 
the  taxonomy;  e.g.  a  refinement  from  (finite)  bags  to  (finite)  sequences  is  indexed  at 
the  node  specifying  (finite)  bag  theory. 

The  paper  [27]  gives  a  taxonomy  of  algorithm  design  theories.  The  refinements 
indexed  at  each  node  correspond  to  (families  of)  program  schemes.  The  algorithm 
theory  associated  with  a  scheme  is  sufficient  to  prove  the  consistency  of  any  instance 
of  the  scheme.  Nodes  that  are  deeper  in  a  taxonomy  correspond  to  specifications 
that  have  more  structure  than  those  at  shallower  levels.  Generally,  we  wish  to  select 
refinements  that  are  indexed  as  deeply  in  the  taxonomy  as  possible,  since  the  maximal 
amount  of  structure  in  the  requirement  specification  will  be  exploited.  In  the  algorithm 
taxonomy,  the  deeper  the  node,  the  more  structure  that  can  be  exploited  in  the 
problem,  and  the  more  problem-solving  power  that  can  be  brought  to  bear.  Roughly 
speaking, narrowly scoped but faster algorithms are deeper in the taxonomy, whereas
widely  applicable  general  algorithms  are  at  shallower  nodes. 
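The preference for the deepest applicable node can be illustrated with a toy Python sketch over part of the container taxonomy. It replaces the theorem proving used to construct classification arrows with a check of each node's added axiom on a finite sample, so it is a testing heuristic only; the tuple encoding of the taxonomy and all names are assumptions of the sketch.

```python
from itertools import product

def associative(op, xs):
    return all(op(op(x, y), z) == op(x, op(y, z))
               for x, y, z in product(xs, repeat=3))

def commutative(op, xs):
    return all(op(x, y) == op(y, x) for x, y in product(xs, repeat=2))

# Each node is (name, axiom added over its parent, children).
TAXONOMY = ("Container", lambda op, xs: True, [
    ("Proto-Seq", associative, [
        ("Proto-Bag", commutative, []),
    ]),
])

def deepest_node(node, op, sample):
    """Descend the taxonomy while a child's added axiom holds on the sample."""
    name, _, children = node
    for child in children:
        if child[1](op, sample):
            return deepest_node(child, op, sample)
    return name
```

With string concatenation as the join operator the descent stops at Proto-Seq, since concatenation is associative but not commutative, while a sorted merge of tuples reaches Proto-Bag.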

Two problems arise in using a library of refinements: (1) selecting an appropriate
refinement, and (2) constructing a classification arrow. If we organize a library
of refinements into a taxonomy, then the following ladder construction process provides
incremental access to applicable refinements, and simultaneously, incremental
construction of classification arrows.


The process of incrementally constructing a refinement is illustrated in the ladder
construction diagram. The left side of the ladder is a path in a taxonomy starting at
the root. The ladder is constructed a rung at a time from the top down. The initial
interpretation from A_0 to Spec_0 is often simple to construct. The rungs of the ladder
are constructed by a constraint solving process that involves user choices, the
propagation of consistency constraints, calculation of colimits, and constructive theorem
proving [25,26]. Generally, the rung construction is stronger than a colimit, even
though a cocone is being constructed. The intent in constructing I_i : A_i ⇒ Spec_i is
that Spec_i has sufficient defined symbols to serve as the codomain. In other words, the
implicitly defined symbols in A_i are translated to explicitly defined symbols in Spec_i.

Once we have constructed a classification arrow A_n ⇒ Spec_n and selected a
refinement A_n ⇒ B_n that is indexed at node A_n in the taxonomy, then constructing
a refinement of Spec_0 is straightforward: compute the pushout, yielding Spec_{n+1},
then compose arrows down the right side of the ladder and the pushout square to
obtain Spec_0 ⇒ Spec_{n+1} as the final constructed refinement.

Again, rung construction is not simply a matter of computing a colimit. For example,
there are at least two distinct arrows from Divide-and-Conquer to Sorting,
corresponding  to  a  mergesort  and  a  quicksort  -  these  are  distinct  cocones  and  there 
is  no  universal  sorting  algorithm  corresponding  to  the  colimit.  However,  applying  the 
refinement  that  we  select  at  a  node  in  the  taxonomy  is  a  simple  matter  of  computing 
the  pushout.  For  algorithm  design  the  pushout  simply  instantiates  some  definition 
schemes  and  other  axiom  schemes. 
It is unlikely that a general automated method exists for constructing rungs of
the ladder, since it is here that creative decisions can be made. For general-purpose
design it seems that users must be involved in guiding the rung construction process.
However, in domain-specific settings and under certain conditions it will be possible to
automate rung construction (as discussed in the next section). Our goal in Designware
is to build an interface providing the user with various general automated operations
and libraries of standard components. The user applies various operators with the
goal of filling out partial morphisms and specifications until the rung is complete.
After each user-directed operation, constraint propagation rules are automatically
invoked to perform sound extensions to the partial morphisms and specifications
in the rung diagram. Constructive theorem-proving provides the basis for several
important techniques for constructing classification arrows [25,26].

3.4.2.  Tactics  The design process described so far uses primitive operations such
as (1) selecting a spec or refinement from a library, (2) computing the pushout/colimit
of (a diagram of) diagram morphisms, (3) unskolemizing and translating a formula
along a morphism, (4) witness-finding to derive symbol translations during the
construction of classification arrows, and so on. These and other operations can be
made accessible through a GUI, but inevitably, users will notice certain patterns of
such operations arising, and will wish to have macros or parameterized procedures
for them, which we call tactics. They provide higher level (semiautomatic) operations
for the user.

The  need  for  at  least  two  kinds  of  tactics  can  be  discerned. 

1. Classification tactics control operations for constructing classification arrows. The
divide-and-conquer theory admits at least two common tactics for constructing
a classification arrow. One tactic can be procedurally described as follows: (1)
the user selects an operator symbol with a DRO requirement spec, (2) the system
analyzes the spec to obtain the translations of the DRO symbols, (3) the user is
prompted to supply a standard set of constructors on the input domain D, (4) the
tactic performs unskolemization on the composition relation in each Soundness
axiom to derive a translation for Oa, and so on. This tactic was followed in the
mergesort derivation.

The other tactic is similar except that the tactic selects constructors for the composition
relations on R (versus D) in step (3), and then uses unskolemization
to solve for decomposition relations in step (4). This tactic was followed in the
quicksort derivation.

A classification tactic for context-dependent simplification provides another example.
Procedurally: (1) the user selects an expression expr to simplify, (2) type analysis
is used to infer translations for the input and output sorts of expr, (3) a context
analysis routine is called to obtain contextual properties of expr (yielding the
translation for C), (4) unskolemization and witness-finding are used to derive a
translation for new-expr.

2. Refinement tactics control the application of a collection of refinements; they may
compose a common sequence of refinements into a larger refinement step. Planware
has a code-generation tactic for automatically applying spec-to-code interlogic
morphisms. Another example is a refinement tactic for context-dependent
simplification; procedurally, (1) use the classification tactic to construct the classification
arrow, (2) compute the pushout, (3) apply a substitution operation on the
spec to replace expr with its simplified form and to create an isomorphism. Finite
Differencing requires a more complex tactic that applies the tactic for context-dependent
simplification repeatedly in order to make incremental the expressions
set up by applying the Expression-and-Function → Abstracted-Op refinement.
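
The idea that a tactic packages a recurring sequence of primitive operations into one larger step can be sketched as plain function composition. The following is only an illustrative sketch, not the Designware implementation: the `State` dictionary, the `tactic` combinator, and the placeholder primitive `log` are all hypothetical names introduced here, and each primitive merely records its name instead of manipulating specs or morphisms.

```python
from functools import reduce
from typing import Callable, Dict

State = Dict[str, object]        # hypothetical design state (spec, trace, ...)
Step = Callable[[State], State]  # one primitive operation or one tactic


def tactic(*steps: Step) -> Step:
    """Compose steps, left to right, into a single larger step."""
    def run(state: State) -> State:
        return reduce(lambda s, step: step(s), steps, state)
    return run


def log(name: str) -> Step:
    """Placeholder primitive: only appends its name to the state's trace."""
    def step(state: State) -> State:
        return {**state, "trace": [*state.get("trace", []), name]}
    return step


# Refinement tactic for context-dependent simplification, steps (1)-(3) of the text.
simplify = tactic(
    log("construct classification arrow"),
    log("compute pushout"),
    log("substitute simplified expression"),
)

# A more complex tactic that applies the simplification tactic repeatedly
# (here simply twice; a real tactic would iterate as needed).
finite_differencing = tactic(simplify, simplify)

result = finite_differencing({"spec": "S0"})
```

Because a tactic has the same type as a primitive step, tactics nest freely, which is the property the text relies on when it builds Finite Differencing out of the simplification tactic.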

We  can  also  envision  the  possibility  of  metatactics  that  can  construct  tactics  for 
a  given  class  of  tasks.  For  example,  given  an  algorithm  theory,  there  may  be  ways 
to  analyze  the  sorts,  ops  and  axioms  to  determine  various  orders  in  constructing  the 
translations  of  classification  arrows.  The  two  tactics  for  divide-and-conquer  mentioned 
above  are  an  example. 

3.5.  Summary 

The  main  message  of  this  section  is  that  a  formal  software  refinement  process  can  be 
supported  by  automated  tools,  and  in  particular  that  libraries  of  design  knowledge 
can be brought to bear in constructing refinements for a given requirement specification.
One goal of this section has been to show that diagram morphisms are adequate
to capture design knowledge about algorithms, data structures, and expression optimization
techniques, as well as the refinement process itself. We showed how to apply
a  library  refinement  to  a  requirement  specification  by  constructing  a  classification 
arrow  and  computing  the  pushout.  We  discussed  how  a  library  of  refinements  can 
be  organized  into  taxonomies  and  presented  techniques  for  constructing  classification 
arrows  incrementally.  The  examples  and  most  concepts  described  are  working  in  the 
Specware,  Designware,  and  Planware  systems. 

Acknowledgements:  The  work  reported  here  is  the  result  of  extended  collaboration 
with  our  colleagues  at  Kestrel  Institute.  We  would  particularly  like  to  acknowledge  the 
contributions  of  Li  Mei  Gilham,  Junbo  Liu,  Dusko  Pavlovic,  and  Stephen  Westfold. 

4.  Parametrized  specifications 

Parametricity  is  one  of  the  most  effective  ways  to  achieve  compositionality  and  reuse 
in software development. Parametric specifications have been thoroughly analyzed in
the algebraic setting and are by now a standard part of most software development
toolkits.  However,  an  effort  towards  classifying,  specifying  and  refining  algorithmic 
theories,  rather  than  mere  datatypes,  quickly  leads  beyond  the  realm  of  algebra,  and 
often  to  full  first  order  theories.  We  extend  the  standard  semantics  of  parametric 
specifications  to  this  more  general  setting. 

The  familiar  semantic  characterization  of  parametricity  in  the  algebraic  case  is 
expressed  in  terms  of  the  free  functor,  i.e.  using  the  initial  models.  In  the  general 
case,  initial  models  may  not  exist,  and  the  free  functor  is  not  available.  Various 
syntactic,  semantic,  and  abstract  definitions  of  parametricity  have  been  offered,  but 
their  exact  relationships  are  often  unclear.  Using  the  methods  of  categorical  model 
theory,  we  establish  the  equivalence  of  two  well  known,  yet  so  far  unrelated,  definitions 
of parametricity, one syntactic, one semantic. Besides providing support for both
underlying views, and a way of aligning the systems based on each of them, the
general analysis offered here and its formalism open several avenues for future research
and applications.

4.1.  Introduction 

4.1.1.  Parametric  specifications  The  idea  of  parametric  polymorphism  goes  back 
to  Strachey  [30]  and  refers  to  code  reusable  over  any  type  that  may  be  passed  to  it 
as  a  parameter.  If  a  type  is  viewed  as  a  set  of  logical  invariants  of  the  data,  this 
idea  naturally  extends  to  the  software  specifications,  as  the  logical  theories  capturing 
requirements  and  allowing  their  refinement.  The  idea  of  parametric  specifications  was 
proposed  early  on  and  became  a  standard  part  of  specification  theory  (cf.  e.g.  [9, 12, 
13]  and  the  references  therein). 

A  standard  nontrivial  example  of  a  parametric  specification  is  a  presentation  of 
the  theory  of  vector  spaces,  with  the  theory  of  fields  as  its  parameter.  The  idea  is 
that  refining  the  parameter,  in  this  case  the  subtheory  referring  to  scalars,  yields  a 
consistent  refinement  of  the  larger  theory,  usually  called  the  body.  Formally,  we  are 
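given a theory VecSp and a distinguished subtheory Field $\hookrightarrow$ VecSp. The refinement
is realized by the pushout in the category of specifications [6, 14].

\[
\begin{array}{ccc}
\mathrm{Field} & \hookrightarrow & \mathrm{VecSp}[\mathrm{Field}] \\
\downarrow & & \downarrow \\
\mathrm{Real} & \hookrightarrow & \mathrm{VecSp}[\mathrm{Real}]
\end{array}
\]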

The functoriality of the pushout operation ensures the compositionality of the refinements.

Of course, not every interpretation of one specification in another allows this. For
instance, if instead of Field, just the theory of rings is taken as the parameter of VecSp,
some consistent refinements of the parameter will induce inconsistencies in the body.
Some models of the parameter therefore do not correspond to models of the body.
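
The pushout used to instantiate VecSp at Real can be illustrated at the level of symbols only. The sketch below is a simplification under stated assumptions: a specification is reduced to a finite list of symbol names, axioms are ignored, and the symbol names (`K`, `add`, `scale`, `le`, and so on) as well as the function `pushout` are hypothetical. It computes the pushout of finite sets by gluing the two symbol sets along the shared parameter.

```python
def pushout(P, f, A, g, B):
    """Pushout of A <-f- P -g-> B in finite sets.

    Returns (elements, in_a, in_b): the quotient of the disjoint union A + B
    by the identifications f(p) ~ g(p), with the two injection maps.
    """
    parent = {("A", a): ("A", a) for a in A}
    parent.update({("B", b): ("B", b) for b in B})

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for p in P:
        ra, rb = find(("A", f[p])), find(("B", g[p]))
        if ra != rb:
            parent[ra] = rb

    in_a = {a: find(("A", a)) for a in A}
    in_b = {b: find(("B", b)) for b in B}
    return set(in_a.values()) | set(in_b.values()), in_a, in_b


# Hypothetical symbol sets standing in for the three specifications.
field = ["K", "add", "mul"]
vecsp = ["K", "add", "mul", "V", "vadd", "scale"]
real = ["R", "plus", "times", "le"]

include = {s: s for s in field}                       # Field -> VecSp
refine = {"K": "R", "add": "plus", "mul": "times"}    # Field -> Real

symbols, from_vecsp, from_real = pushout(field, include, vecsp, refine, real)
```

The resulting `symbols` play the role of the signature of VecSp[Real]: the scalar symbols of the body are identified with their refinements in Real, while the vector symbols and the extra symbol `le` are carried over unchanged.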

Some  syntactic  parametricity  conditions,  ensuring  that  consistent  refinements  of 
the  parameter  induce  consistent  refinements  of  the  body,  were  proposed  early  on 
[11,15].  However,  the  analogous  semantic  characterizations,  ensuring  that  models  of 
the  parameter  induce  models  of  the  body,  were  given  only  in  terms  of  free  functors, 
which  only  exist  for  (essentially)  algebraic  specifications,  i.e.  those  stated  using  just 
operations  and  equations  (and  simple  implications  between  them).  In  [11],  cofree 
functors  were  analyzed  as  well,  but  for  a  general  first  order  theory,  they  may  not  exist 
either. E.g., the theories of fields, Hilbert spaces, or linear orders have neither
initial nor final models.

Algebraic specifications do suffice for a great many practical tasks and offer a fruitful
ground for theory [9]. However, when it comes to systems for code synthesis, like
Specware™ [17], where it is essential to compositionally refine and implement not
only abstract datatypes, but also abstract algorithmic theories, algebraic specifications
become increasingly insufficient, and initial and final semantics do not apply.

On one hand, a syntactic form of parametricity for general specifications has been
used in practice and in the literature [12, 13]. On the other hand, in [8], a semantical
definition of parametricity was proposed, independent of the existence of initial or
final models. However, it seems that neither the semantic characterization of the former
nor the syntactic characterization of the latter has been worked out. Abstracting
away from the concrete meaning of parametricity, some interesting structures have
been built, applicable to parametric specifications in general [7, 29], yet no statement
tying together the syntactic and the semantic intuitions seems to have been proved.
The purpose of the present paper is to try to bridge this gap, while providing some evidence
of the applicability of categorical model theory to the study of general software
specifications.^

4.1.2. Elements of categorical model theory The functorial semantics of algebraic
theories goes back to the sixties, to Lawvere’s thesis [19]. The theory of categorical
universal algebra which arose from it is summarized in [23]. An important step
beyond algebra is the study of locally presentable categories [10], which come about
as the model categories of limit theories, a wider, yet essentially restricted class. The
full scope of first order logic was covered by categorical model theory rather slowly,
throughout the seventies and eighties, as some parts tend to be technically rather
demanding. Good accounts of the more accessible parts are [2, 21, 22].

^ In contrast, the purported algebraicizing of general specifications in higher order logic by presenting the
first order theorems as higher order equations only shifts the problems from the large but familiar area of
first order model theory to the scarcely cultivated field of higher order algebra.

The main idea of functorial semantics is to

- present logical theories as classifying categories with structure, so as to

- obtain their models as structure preserving functors to Set, with homomorphisms
between them as natural transformations.

The  resulting  categories  of  models  will  always  be  accessible,  i.e.  have  directed  colimits 
and  a  suitable  generating  set.  Conversely,  every  accessible  category  can  be  obtained 
as  the  category  of  models  of  a  first  order  theory,  possibly  infinitary.  Categorical  model 
theory  is  thus  the  study  of  accessible  categories  and  the  way  they  arise  from  theories. 
There is a very general Stone-type duality between the first order theories, presented
as  categories,  and  the  induced  categories  of  models  [20],  but  it  is  quite  involved  in 
technical  details,  and  it  is  not  clear  whether  it  can  be  brought  into  a  practically  useful 
form. 

But without going into the formal duality, one can still systematically explore the
relationships between the syntactic and the semantic aspects of theories, by analyzing
functors between their categorical presentations. In particular, for any two first order
theories $\mathbb{A}$ and $\mathbb{B}$, presented as classifying categories, one can align the properties
of the logical interpretations, which can be captured as functors $F : \mathbb{A} \to \mathbb{B}$, and
the induced forgetful, or “reduct”, functors $F^* : \mathsf{Mod}(\mathbb{B}) \to \mathsf{Mod}(\mathbb{A})$ between the
corresponding categories of models.

This is a typical task for semantics of software specifications: analyze how a particular
class of syntactical manipulations with theories is reflected on their models,
and on the computations that may be built on top. We shall show that a syntactic
definition of parametric specification, viewed as a property of the interpretation functor
$F : \mathbb{A} \to \mathbb{B}$, is equivalent to an independent semantic definition, stated in terms
of the “reduct” functor $F^* : \mathsf{Mod}(\mathbb{B}) \to \mathsf{Mod}(\mathbb{A})$.

4.1.3. Outline of the paper In the next section, we describe the concrete constructions
of classifying categories, explain how interpretations are captured as functors
between them, and how the idea of parametricity fits into this setting.

In section 4.3 we list some abstract preliminary results that align the syntactic
and semantic properties of functors.

Finally, in section 4.4, we derive the main result: the equivalence of a syntactic
form of parametricity, in the spirit of [12, 13], and a semantic form, as in [8], both
adapted only to a common categorical setting.

4.2. Theories and models, categorically

4.2.1. Classifying categories The simplest classifying category is the Lawvere
clone $\mathcal{C}_{\mathcal{T}}$ of an algebraic theory $\mathcal{T}$, say single sorted. Its objects can be viewed as
natural numbers (viz. the arities), while a morphism from $m$ to $n$ is an $n$-tuple of
the elements of the free algebra in $m$ generators, i.e. a function $n \to Tm$, where
$T$ denotes the free algebra constructor.^ A crucial observation from Lawvere’s thesis
[19] is that $\mathcal{C}_{\mathcal{T}}$ classifies $\mathcal{T}$-algebras, in the sense that they exactly correspond to the
product preserving functors $\mathcal{C}_{\mathcal{T}} \to \mathsf{Set}$, while the $\mathcal{T}$-homomorphisms correspond
to the natural transformations between them. Indeed, since $n$ in $\mathcal{C}_{\mathcal{T}}$ appears as the
product of $n$ copies of 1, the product preservation ensures that the functors $\mathcal{C}_{\mathcal{T}} \to \mathsf{Set}$
trace the operations with the correct arities. The equations of $\mathcal{T}$ are then enforced
by the functoriality. Detailed explanations of the functorial semantics of algebraic
theories can be found e.g. in [23].

If models of more general theories are to be captured as functors, some additional
preservation properties will be needed, in order to enforce satisfaction of not necessarily
equational formulas, that may express more than mere commutativity conditions.
There are several well known frameworks for building suitable classifying categories
and developing functorial semantics for general first order theories, the most “categorical”
being probably sketches [4, 21]. We shall however work in the setting of
coherent categories [22], closest to the original geometric spirit of categorical logic,
because they seem to allow the quickest and perhaps the most intuitive approach to
the matters presently of interest.

4.2.2. Coherent categories Let $\mathcal{T}$ be a multisorted first order theory with equality.
For simplicity, we assume that it is purely relational: operations are captured by
their graphs. Moreover, $\mathcal{T}$ is assumed to be generated by a set of axioms in coherent
logic, i.e. using finitary $\wedge$ and $\vee$, including the empty ones, $\top$ and $\bot$, and the quantifier
$\exists$. The underlying logic can be classical or intuitionistic. We cannot go into the
details here, but reducing finitary first order logic to its coherent fragment is a fairly
standard technical device (see [2, 1, 22] and especially the informative introduction of
[21]). The extension to infinitary logic is justified by stable and natural categories of
models and is routinely handled by extending the classifying constructions. However,
some of the proofs presented below essentially depend on the finiteness assumption.

Formally, the theory $\mathcal{T}$ can be viewed as a preorder: the underlying set $|\mathcal{T}|$ of
well-formed formulas is generated by its language, while the entailment preorder $\vdash$ is
generated by its axioms. The rough idea is to capture the well-formed formulas of $\mathcal{T}$
as the objects of the classifying category $\mathcal{C}_{\mathcal{T}}$, and the theorems of $\mathcal{T}$ as the morphisms
of $\mathcal{C}_{\mathcal{T}}$.

^ So if $\mathcal{T}$ is presented by the monad $T$, the classifier $\mathcal{C}_{\mathcal{T}}$ is the dual of the induced Kleisli category, restricted
to natural numbers.

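The passage from the formulas of $\mathcal{T}$ to the objects of $\mathcal{C}_{\mathcal{T}}$ requires an adjustment:
the formulas must be viewed modulo variable renaming, i.e. $\alpha$-conversion $\varphi(x) \sim \varphi(y)$,
where $x$ and $y$ are vectors of variables. Note that this is not a congruence with respect
to the logical operations, because e.g. $\varphi(x) \wedge \varphi(y) \not\sim \varphi(x) \wedge \varphi(x)$.

The passage from theorems of $\mathcal{T}$ to morphisms of $\mathcal{C}_{\mathcal{T}}$ requires a similar adjustment:
modulo the logical equivalence $\varphi \dashv\vdash \psi$, which means that $\varphi \vdash \psi$ and $\psi \vdash \varphi$. The
definition is thus

\[
\begin{aligned}
|\mathcal{C}_{\mathcal{T}}| \;&=\; |\mathcal{T}| / {\sim} \\
\mathcal{C}_{\mathcal{T}}\bigl(\alpha(x), \beta(y)\bigr) \;&=\; \bigl\{ \vartheta(x,y) \in \mathcal{T} \;\bigm|\; \vartheta(x,y) \vdash \alpha(x) \wedge \beta(y), \\
&\qquad\quad \alpha(x) \vdash \exists y.\ \vartheta(x,y), \\
&\qquad\quad \vartheta(x,y') \wedge \vartheta(x,y'') \vdash y' = y'' \bigr\} \big/ {\dashv\vdash}
\end{aligned}
\]

where $x$ and $y$ are disjoint strings of variables, always available by renaming^, and
$=$ is the equality predicate in $\mathcal{T}$. The identities in $\mathcal{C}_{\mathcal{T}}$ are induced by the equality
predicates, and the composition of $\vartheta(x,y)$ and $\varrho(y,z)$ is $\exists y.\ \vartheta(x,y) \wedge \varrho(y,z)$.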
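
Read in a single finite model, the three conditions on a morphism say that the formula defines a total, single-valued relation between the extensions of its source and target, and the composition is ordinary relational composition. The sketch below illustrates only this set-level reading, not the syntactic category itself; the function names `is_morphism` and `compose` and the sample data are hypothetical.

```python
def is_morphism(theta, alpha, beta):
    """Check the three defining conditions, read as statements about finite sets.

    theta is a set of pairs (x, y); alpha and beta are the extensions of the
    source and target formulas.
    """
    typed = all(x in alpha and y in beta for x, y in theta)            # theta |- alpha and beta
    total = all(any(x == x1 for x1, _ in theta) for x in alpha)        # alpha |- exists y. theta
    single_valued = all(
        y1 == y2 for x1, y1 in theta for x2, y2 in theta if x1 == x2  # y' = y''
    )
    return typed and total and single_valued


def compose(theta, rho):
    """All (x, z) for which some y satisfies theta(x, y) and rho(y, z)."""
    return {(x, z) for x, y in theta for y2, z in rho if y == y2}


alpha = {0, 1, 2}
beta = {"even", "odd"}
gamma = {"yes", "no"}

theta = {(0, "even"), (1, "odd"), (2, "even")}   # alpha -> beta
rho = {("even", "yes"), ("odd", "no")}           # beta -> gamma

composite = compose(theta, rho)
identity = {(x, x) for x in alpha}               # induced by the equality predicate
```

In this reading the composite of two morphisms is again a morphism, and the equality relation acts as the identity, which is what makes the definition above a category.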

The logical structure of $\mathcal{T}$ induces the categorical structure of $\mathcal{C}_{\mathcal{T}}$:

- finite limits are constructed using the conjunction and the variable tupling, starting
from the true predicates $\top(x)$ over each sort;

- regular epi-mono factorisations are constructed using the existential quantifier;
and finally

- joins of the subobjects correspond to the disjunctions.

These  three  structural  components  constitute  a  coherent  category  and  are  preserved 
by coherent functors. Theories in coherent logic generate coherent classifying categories;
conversely, each small coherent category classifies a coherent theory. Coherent
functors  preserve  the  truth  of  the  theorems  in  coherent  logic.  The  reader  may  wish 
to  work  out  the  details  of  this  correspondence  or  to  consult  some  of  the  mentioned 
references. 

A  reader  familiar  with  the  functorial  semantics  of  algebra  has  perhaps  already 
noticed  that  the  coherent  classifier  of  an  algebraic  theory  contains  the  corresponding 
Lawvere clone as a full subcategory, namely the one spanned by the true formulas
$\top(x)$, one for each arity $x$. Indeed, the coherent classifier of an algebraic theory is the
coherent  completion  of  its  Lawvere  clone.  The  coherent  classifiers  have  a  richer  set 
of  objects,  in  order  to  impose  the  preservation  of  more  general  axioms;  but  simpler 
theories  can  be  captured  by  smaller  classifiers. 

^ By abuse of notation, $\alpha(x)$, $\beta(y)$ and $\vartheta(x,y)$ denote their equivalence classes $[\alpha]$, $[\beta]$ and $[\vartheta]$ modulo $\sim$.

4.2.3. Interpretations and models The upshot of coherent classifying categories
is thus that the coherent functors, preserving the coherent structure, preserve the
coherent logic, and thus enforce the satisfaction of the coherent theorems, represented
as the morphisms in coherent categories. A coherent functor $\mathcal{C}_{\mathcal{T}} \to \mathcal{C}_{\mathcal{U}}$ can thus be
viewed as a sound interpretation of the theory $\mathcal{T}$ in the theory $\mathcal{U}$. But since every
small coherent category $\mathbb{A}$ can be obtained as the classifier $\mathcal{C}_{\mathcal{T}}$ of some coherent
theory $\mathcal{T}$, every coherent functor $F : \mathbb{A} \to \mathbb{B}$ can be understood logically, as such
an interpretation.

Although it is not small, $\mathsf{Set}$ has all the coherent structure, and the coherent
functors $\mathcal{C}_{\mathcal{T}} \to \mathsf{Set}$ are exactly the models of $\mathcal{T}$. The natural transformations are the
$\mathcal{T}$-homomorphisms, preserving all the definable operations. For every small coherent
$\mathbb{A}$, we shall denote by $\mathsf{Mod}(\mathbb{A})$ the category of coherent functors $\mathbb{A} \to \mathsf{Set}$. This is
the category of models of $\mathbb{A}$. As pointed out before, categories of the form $\mathsf{Mod}(\mathbb{A})$ are
accessible, and by allowing infinite disjunctions, one could get (an equivalent version
of) every accessible category in this form [2, ch. 5].

On the other hand, by precomposition, every coherent functor $F : \mathbb{A} \to \mathbb{B}$ induces
a “reduct” $F^* : \mathsf{Mod}(\mathbb{B}) \to \mathsf{Mod}(\mathbb{A})$, reinterpreting a model $N : \mathbb{B} \to \mathsf{Set}$ of $\mathbb{B}$ as
a model $NF : \mathbb{A} \to \mathsf{Set}$ of $\mathbb{A}$. This is the arrow part of the Mod-construction, which
yields an indexed category $\mathsf{Mod} : \mathsf{Coh}^{op} \to \mathsf{CAT}$, where $\mathsf{Coh}$ is the category of small
coherent categories and functors, and $\mathsf{CAT}$ is the metacategory of categories. $\mathsf{Mod}$
thus assigns a semantics to each coherent theory $\mathcal{T}$, classified by a coherent category
$\mathcal{C}_{\mathcal{T}}$; in other words, it maps each theory $\mathcal{T}$ to its category of models, captured as
coherent functors $\mathcal{C}_{\mathcal{T}} \to \mathsf{Set}$.
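
The reduct along an interpretation, and the contravariance of the Mod-construction, can be sketched on symbols alone. In this simplified and purely illustrative picture an interpretation is a dictionary sending the symbols of one theory to symbols of another, and a model is a dictionary sending symbols to finite sets; the theories, the symbol names, and the helpers `reduct` and `then` are assumptions made for the example.

```python
def reduct(F, N):
    """F*: precompose the model N with the interpretation F (the model N after F)."""
    return {a: N[F[a]] for a in F}


def then(F, G):
    """Composite interpretation: first F, then G."""
    return {a: G[F[a]] for a in F}


# Hypothetical interpretations: monoids into rings, rings into a richer theory.
monoid_to_ring = {"M": "R", "op": "times"}
ring_to_richer = {"R": "R", "plus": "plus", "times": "times"}

# A model of the richer theory on the two-element carrier {0, 1};
# relations are given by their graphs.
bits = (0, 1)
N = {
    "R": set(bits),
    "plus": {(a, b, (a + b) % 2) for a in bits for b in bits},
    "times": {(a, b, a * b) for a in bits for b in bits},
    "leq": {(0, 0), (0, 1), (1, 1)},
}

as_ring = reduct(ring_to_richer, N)          # forgets "leq"
as_monoid = reduct(monoid_to_ring, as_ring)  # keeps only the multiplicative part
```

Taking the reduct in two steps agrees with taking it once along the composite interpretation, which is the sense in which the assignment of model categories reverses the direction of interpretations.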

The semantical functor $\mathsf{Mod}$ is an instance of a specification frame in the sense
of Ehrig and Große-Rhode [8]. Specification frames are indexed categories, construed
as some abstract model category assignments, like $\mathsf{Mod}$. In these terms, Ehrig and
Große-Rhode proposed a semantical definition of parametric specifications, which will
be analyzed in the sequel.

4.2.4. Parametrized specifications as functors: syntactic vs semantic definitions
A reader unfamiliar with coherent logic may
wish to write down, as a quick exercise, say, the coherent theories of fields and vector
spaces and analyze their classifying categories. The classifying category Field is of
course a subcategory of the classifying category VecSp. The obvious functor Field $\hookrightarrow$
VecSp is full and faithful. This means that the theory of vector spaces is conservative
over the theory of fields: no new theorems about the scalars can be proved using the
vectors. Moreover, Field $\hookrightarrow$ VecSp is also a powerful functor: each subobject of an
object in the image is also in the image. This means that every predicate on scalars,
expressible in the theory of vector spaces, is already expressible in the theory of fields.
The embedding Field $\hookrightarrow$ VecSp is a typical parametric specification, defined syntactically,
as in [12, 13]. Viewed in the setting of classifying categories, a parametric
specification is thus a coherent functor $F : \mathbb{A} \to \mathbb{B}$, which is full, faithful and powerful.

On the semantic side, as already mentioned, Ehrig and Große-Rhode [8] have
proposed an abstract definition of parametricity, applicable to the functor
$\mathsf{Mod} : \mathsf{Coh}^{op} \to \mathsf{CAT}$. Omitting the presentation details, a parametric specification is,
according to this definition, an interpretation $F : \mathbb{A} \to \mathbb{B}$, such that the induced functor
$F^* : \mathsf{Mod}(\mathbb{B}) \to \mathsf{Mod}(\mathbb{A})$ is a retraction, i.e. there is a functor $\Phi : \mathsf{Mod}(\mathbb{A}) \to \mathsf{Mod}(\mathbb{B})$
with $F^* \circ \Phi \cong \mathrm{Id}$. In words, $\Phi$ maps each model $M$ of the parameter $\mathbb{A}$ into a
model $N = \Phi M$ of the body $\mathbb{B}$ in such a way that the forgetful functor $F^*$ restores an
isomorphic copy^ of $M$. Such a functor $\Phi$, which nondestructively expands a model,
is said to be persistent [9, sec. 10B].

In the present paper, we shall show that the above two definitions are roughly
equivalent: a coherent functor $F : \mathbb{A} \to \mathbb{B}$ is full, faithful and powerful if and only if
$F^* : \mathsf{Mod}(\mathbb{B}) \to \mathsf{Mod}(\mathbb{A})$ is a retraction, in the strong sense that every splitting of
its object part can be refined, by taking quotients, into a splitting functor.

Completeness  view. 

When an indexed family of sets $\{B_x \mid x \in A\}$ is represented as a function
$f : B \to A$, with $B_x = f^{-1}(x)$, an indexed element $b \in \prod_{x \in A} B_x$ becomes a splitting
$\varphi : A \to B$, $f \circ \varphi = \mathrm{id}$, with $b_x = \varphi(x) \in B_x$.

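Similarly, a specification $\mathbb{B}$ parametrized over $\mathbb{A}$ can be thought of as a family
of the instances of $\mathbb{B}$ indexed over the instances of $\mathbb{A}$. In particular, the functor
$F^* : \mathsf{Mod}(\mathbb{B}) \to \mathsf{Mod}(\mathbb{A})$ can be construed as a family of $\mathbb{B}$-models indexed over
$\mathbb{A}$-models. A splitting $\Phi : \mathsf{Mod}(\mathbb{A}) \to \mathsf{Mod}(\mathbb{B})$, $F^* \circ \Phi = \mathrm{Id}$, then becomes an indexed
model of $\mathbb{B}$, parametrized over $\mathbb{A}$.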
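
The set-level picture behind this analogy, a family of sets presented as a function together with a choice of one element per fibre, can be sketched as follows. The example is only illustrative: the helper names `fibres` and `is_splitting` and the sample data are hypothetical, and functions between finite sets are represented as dictionaries.

```python
def fibres(f, A):
    """The family B_x = f^{-1}(x), for each index x in A."""
    return {x: {b for b, fx in f.items() if fx == x} for x in A}


def is_splitting(phi, f, A):
    """phi : A -> B splits f : B -> A when f(phi(x)) = x for every x in A."""
    return all(x in phi and f[phi[x]] == x for x in A)


A = {"x", "y"}
f = {"b1": "x", "b2": "x", "b3": "y"}   # f : B -> A, with B = {b1, b2, b3}

phi = {"x": "b2", "y": "b3"}            # picks one element in each fibre
not_a_splitting = {"x": "b3", "y": "b3"}
```

A splitting is the same thing as an indexed element: each `phi[x]` lies in the fibre over `x`. At the level of theories, the role of `f` is played by the reduct functor and the role of `phi` by a persistent functor.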

According to this view, a persistent functor is thus an indexed model. The parametricity
of theories lifts to the parametricity of their models: the semantical definition
of  parametric  specification,  described  above,  boils  down  to  the  requirement  that  there 
is  a  parametric  model  of  the  body  indexed  over  the  models  of  the  parameter. 

The  equivalence  of  the  semantic  and  the  syntactic  definitions  of  parametricity, 
which  we  are  about  to  establish,  thus  becomes  a  soundness-and-completeness  theorem, 
in  indexed  form. 

^ The original definition actually requires that $M$ is recovered on the nose, i.e. that the strict equality
$F^* \circ \Phi = \mathrm{Id}$ holds. But in abstract functorial calculus, this is almost never possible.

4.3. Syntactic vs semantic properties of functors

4.3.1. Preliminaries We begin by listing some useful terminology and facts from
the general functorial calculus.

Definition 1. A functor $F : \mathbb{A} \to \mathbb{B}$ is said to be

embedding: if it is full and faithful;

subcovering: if for every object $B \in \mathbb{B}$ there is a finite diagram $D$ in $\mathbb{B}$, such that
(1) $B$ is the colimit of $D$, and (2) for every node $D_i$ of $D$ there is some $A_i$ in $\mathbb{A}$
and a monic $D_i \rightarrowtail FA_i$ in $\mathbb{B}$;

subobject covering: if every $B \in \mathbb{B}$ is a subobject of some $FA$, $A \in \mathbb{A}$ (in other
words, if it is subcovering and the diagrams $D$ can be chosen to have one node and
no edges);

powerful: if all subobjects^ of $FA$ in $\mathbb{B}$ lie in the image of $F$. More precisely, for
every monic $d : D \rightarrowtail FA$ in $\mathbb{B}$ there is a monic $s : S \rightarrowtail A$ in $\mathbb{A}$ and an isomorphism
$i : D \to FS$ such that $d = Fs \circ i$;

retraction: if it has a right inverse (i.e., there is $G : \mathbb{B} \to \mathbb{A}$ with $FGM = M$ for
all $M \in \mathbb{B}$);

uniform retraction: if it is a retraction, and every splitting of its object part refines
to a right inverse (more precisely, if $\Gamma : |\mathbb{B}| \to |\mathbb{A}|$ is such that $F\Gamma M = M$ for all
$M \in \mathbb{B}$, then there is a functor $G : \mathbb{B} \to \mathbb{A}$, where $GM$ is a quotient of $\Gamma M$ and
$FGM \cong M$);

(co)reflection: if it has a right inverse which is also its right (resp. left) adjoint.


Lemma  1.  A  powerful  and  subobject  covering  functor  is  essentially  surjective. 

Lemma 2. $F : \mathbb{A} \to \mathbb{B}$ is faithful if and only if

\[ F(\varphi) \vdash F(\psi) \implies \varphi \vdash \psi \qquad (1) \]

As the converse of (1) is always true, a faithful coherent functor $F$ always induces an
“order isomorphism” on the subobject lattices.

To prove lemma 2, use the fact that $\varphi \vdash \psi$ if and only if $\varphi = \varphi \wedge \psi$.

Proposition 1. A coherent functor must be full as soon as it is both faithful and
powerful.

Proof. Since $F$ is powerful, the graph of any $h : FA \to FB$ must be in the essential
image of $F$: there must be a monic $k \rightarrowtail A \times B$ in $\mathbb{A}$ the $F$-image of which is isomorphic
with the graph $\chi = \langle \mathrm{id}, h \rangle : FA \to FA \times FB$. The relation $Fk$ thus satisfies

\[
\begin{aligned}
\delta_{FA} &\vdash Fk \,;\, Fk^{op} \\
Fk^{op} \,;\, Fk &\vdash \delta_{FB}
\end{aligned}
\]

which respectively tell that it is total and single valued. Taking into account that for
the identity relation $\delta = \langle \mathrm{id}, \mathrm{id} \rangle$ holds $\delta_{FX} = F\delta_X$, and using (1), we conclude that $k$
is a total and single valued relation in $\mathbb{A}$. In any regular category, such a relation must
be isomorphic to one in the form $\langle \mathrm{id}, k \rangle : A \to A \times B$. Since clearly $F\langle \mathrm{id}, k \rangle = \langle \mathrm{id}, h \rangle$,
we conclude that $Fk = h$. □

^ Recall that subobjects are isomorphism classes of monics.

4.3.2. Basic results In the sequel, we assume that $F : \mathbb{A} \to \mathbb{B}$ is a coherent
functor between coherent categories, and $F^* : \mathsf{Mod}(\mathbb{B}) \to \mathsf{Mod}(\mathbb{A})$ is the functor
induced by the precomposition. We use and extend some results from [22]. Note
that some of them essentially depend on strong model theoretic assumptions, such
as compactness. The proofs are thus largely non-constructive, as they depend on the
axiom of choice.

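Proposition 2. $F$ is faithful if and only if $F^*$ is essentially surjective.

Proof. By lemma 2, $F$ is faithful if and only if

\[ F\varphi \vdash F\psi \implies \varphi \vdash \psi \]

By the completeness theorem [22, thm. 5.1.7] $F\varphi \vdash F\psi$ holds if and only if

\[ \forall N \in \mathsf{Mod}(\mathbb{B}).\ NF\varphi \subseteq NF\psi \]

whereas $\varphi \vdash \psi$ holds if and only if

\[ \forall M \in \mathsf{Mod}(\mathbb{A}).\ M\varphi \subseteq M\psi \]

The last two statements are clearly equivalent if $F^*$ is essentially surjective, i.e.

\[ \forall M \in \mathsf{Mod}(\mathbb{A})\ \exists N \in \mathsf{Mod}(\mathbb{B}).\ M \cong F^*N \]

Conversely, if there is $M \in \mathsf{Mod}(\mathbb{A})$ different from $F^*N$ for all $N \in \mathsf{Mod}(\mathbb{B})$, one can
use compactness to construct a formula $\psi$ such that $NF\psi$ is true for all models $N$ of
$\mathbb{B}$, whereas $M\psi$ is not. □

Definition 2. $F^* : \mathsf{Mod}(\mathbb{B}) \to \mathsf{Mod}(\mathbb{A})$ is said to be subfull if every $\mathbb{A}$-homomorphism
$h : F^*N' \to F^*N''$ preserves all $\mathbb{B}$-subobjects, i.e. for every monic $D \rightarrowtail FA$ in $\mathbb{B}$
holds

\[ h_A(N'D) \subseteq N''D \]

\[
\begin{array}{ccc}
N'D & \dashrightarrow & N''D \\
\downarrow & & \downarrow \\
N'FA & \xrightarrow{\;h_A\;} & N''FA
\end{array}
\qquad (2)
\]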


Proposition 3. $F$ is powerful if and only if $F^*$ is subfull.

Proof. By definition, $F$ is powerful if and only if every $d : D \rightarrowtail FA$ is in the essential
image of $F$, i.e. $d = Fs$ for some $s : S \rightarrowtail A$. So (2) must commute because it is isomorphic
with the square

\[
\begin{array}{ccc}
N'FS & \xrightarrow{\;h_S\;} & N''FS \\
{\scriptstyle N'Fs}\downarrow & & \downarrow{\scriptstyle N''Fs} \\
N'FA & \xrightarrow{\;h_A\;} & N''FA
\end{array}
\]

which commutes by the naturality of $h$.

The other way around, the fact that the subfullness of $F^*$, i.e. the commutativity
of squares (2), implies that $F$ is powerful is one of the main constituents of the Makkai-Reyes
conceptual completeness theorem [22, ch. 7§1]. The proof can be extracted from
[22, thms. 7.1.4-4’], and essentially depends on compactness. □

Proposition 4. $F$ is subcovering if and only if $F^*$ is faithful.

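Proof. Suppose $F$ is subcovering and let $F^*g = F^*h$ for some $\mathbb{B}$-homomorphisms
$g, h : N' \to N''$. The equation $F^*g = F^*h$ means that $gFA = hFA : N'FA \to N''FA$
for all $A \in \mathbb{A}$.

I claim that then $gB = hB : N'B \to N''B$ must hold for every $B \in \mathbb{B}$. Since $F$
is subcovering, for each $B$ there is a finite diagram $D$, with (1) a colimit cocone to $B$,
i.e. a jointly epimorphic family $\{b_i : D_i \to B\}_{i=1}^{n}$, and (2) the inclusions $\{d_i : D_i \rightarrowtail FA_i\}_{i=1}^{n}$
for some objects $A_1, \ldots, A_n \in \mathbb{A}$. Hence

$$\begin{array}{ccc}
N'FA_i & \xrightarrow{\ gFA_i \,=\, hFA_i\ } & N''FA_i \\
{\scriptstyle N'd_i}\uparrow & & \uparrow{\scriptstyle N''d_i} \\
N'D_i & \underset{hD_i}{\overset{gD_i}{\rightrightarrows}} & N''D_i \\
{\scriptstyle N'b_i}\downarrow & & \downarrow{\scriptstyle N''b_i} \\
N'B & \underset{hB}{\overset{gB}{\rightrightarrows}} & N''B
\end{array} \qquad (3)$$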


Naturality of $g$ and $h$ now yields

$$\begin{aligned}
N''d_i \circ gD_i &= gFA_i \circ N'd_i \\
&= hFA_i \circ N'd_i \\
&= N''d_i \circ hD_i
\end{aligned}$$

But since models are left exact, each $N''d_i$ is still a monic, and therefore $gD_i = hD_i$,
for all $i = 1, \ldots, n$.

Using naturality again, we get

$$\begin{aligned}
gB \circ N'b_i &= N''b_i \circ gD_i \\
&= N''b_i \circ hD_i \\
&= hB \circ N'b_i
\end{aligned}$$

But since models preserve the finite unions of subobjects, the family $\{N'b_i\}_{i=1}^{n}$ must be jointly
epimorphic again, and therefore $gB = hB$. Thus $g = h$, and $F^*$ is faithful.

For the converse, one assumes that there is $B \in \mathbb{B}$ not subcovered by $F$, and,
using compactness, constructs models $N'$ and $N''$ and two homomorphisms $g \neq h : N' \to N''$
such that $F^*g = F^*h$. The details are in [22, thms. 7.1.6-6']. □


Logical  meaning. 

Proposition 2 tells that each $\mathbb{A}$-model extends back along $F^*$ to some $\mathbb{B}$-model
if and only if $F : \mathbb{A} \to \mathbb{B}$ is faithful. However, this does not guarantee that every
$\mathbb{A}$-homomorphism between $\mathbb{A}$-models will extend to a $\mathbb{B}$-homomorphism between their
extensions. Indeed, according to proposition 3, a necessary condition for this is that
$F : \mathbb{A} \to \mathbb{B}$ is powerful.

Together, these conditions provide a basis for aligning the syntactical and the semantical
definitions of parametricity, as described in section 4.2.4.

4.4.  Characterizing  parametric  specifications 

Theorem 1. For a coherent functor $F : \mathbb{A} \to \mathbb{B}$ and the induced “reduct”
$F^* : \mathrm{Mod}(\mathbb{B}) \to \mathrm{Mod}(\mathbb{A})$, the following statements are equivalent.

(a)  $F$ is a powerful embedding.

(b)  $F^*$ is subfull and essentially surjective.

(c)  $F^*$ is a uniform retraction.

If $\mathrm{Mod}(\mathbb{B})$ has coproducts, then the above conditions are also equivalent with

(d)  $F^*$ is a coreflection.

Note that, since $\mathrm{Mod}(\mathbb{B})$ is finitely accessible, it has coproducts if and only if it
is locally finitely presentable, i.e. when $\mathbb{B}$ classifies an essentially algebraic theory [2,
sec. 3D].

Proof. (a)⇔(b) By proposition 1, it suffices to check that $F$ is faithful and powerful.
By proposition 2, $F$ is faithful if and only if $F^*$ is essentially surjective. By proposition
3, $F$ is powerful if and only if $F^*$ is subfull.

To simplify the proof of (b)⇒(c), we shall freely use the established equivalence
(a)⇔(b). Given that $F^*$ is essentially surjective and subfull, we thus know that $F$
is full, faithful and powerful. Using all that, we define $\Phi : \mathrm{Mod}(\mathbb{A}) \to \mathrm{Mod}(\mathbb{B})$, such
that $F^* \circ \Phi \cong \mathrm{Id}$.

Since $F^*$ is essentially surjective, for every $M$ in $\mathrm{Mod}(\mathbb{A})$, there is some $L$ in
$\mathrm{Mod}(\mathbb{B})$ such that $M \cong F^*L$. But the homomorphisms to or from $M$ may not extend
to every such $L$, so we cannot simply take $\Phi M = L$.

On the other hand, like any functor, $M : \mathbb{A} \to \mathrm{Set}$ has the right Kan extension,
a functor $F_\# M : \mathbb{B} \to \mathrm{Set}$ [18], defined

$$F_\# M(B) = \lim\, M \circ \mathrm{Cod}(B/F) \qquad (4)$$

where $B/F$ is the comma category, spanned by the arrows of the form $a : B \to FA$ in
$\mathbb{B}$. A morphism from $a : B \to FA$ to $c : B \to FC$ is an arrow $g : A \to C$ in $\mathbb{A}$ such
that $Fg \circ a = c$. The image of $B \in \mathbb{B}$ along $F_\# M$ is thus the limit of the diagram
$B/F \xrightarrow{\mathrm{Cod}} \mathbb{A} \xrightarrow{M} \mathrm{Set}$.

The construction is functorial and it is not hard to see that $F^* F_\# = \mathrm{Id}$ holds
if and only if $F$ is faithful. So $F_\# M$ might be a candidate for $\Phi M$. But the assumption
that $M : \mathbb{A} \to \mathrm{Set}$ is coherent does not generally follow for $F_\# M : \mathbb{B} \to \mathrm{Set}$. The
$F_\#$-image of an $\mathbb{A}$-model $M$ may not be a $\mathbb{B}$-model, and the functor $F_\# : \mathrm{Set}^{\mathbb{A}} \to \mathrm{Set}^{\mathbb{B}}$
does not restrict to a functor $\mathrm{Mod}(\mathbb{A}) \to \mathrm{Mod}(\mathbb{B})$.

But the desired model $\Phi M : \mathbb{B} \to \mathrm{Set}$ can actually be “interpolated” between
the Kan extension $F_\# M : \mathbb{B} \to \mathrm{Set}$, and the arbitrary model $L : \mathbb{B} \to \mathrm{Set}$ such
that $F^*L \cong M$.

First of all, since $F^* \dashv F_\#$, every $F^*L \to M$ induces $L \to F_\# M$. Given, as
above, $M = F^*L$, for every $a : B \to FA$ in $\mathbb{B}$, there is $La : LB \to LFA = MA$.
Hence a cone $(La)_{a \in B/F} : LB \to M \circ \mathrm{Cod}(B/F)$. By definition (4), this cone induces
a unique arrow $\theta B : LB \to F_\# M(B)$.

Let the functor $\Phi M : \mathbb{B} \to \mathrm{Set}$ be defined as the monic image of $\theta : L \to F_\# M$,
i.e.

$$\theta B : LB \twoheadrightarrow \Phi M(B) \rightarrowtail F_\# M(B) \qquad (5)$$

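This will indeed be a model. Although $F_\# M : \mathbb{B} \to \mathrm{Set}$ is not a model, when
$F : \mathbb{A} \to \mathbb{B}$ and $M : \mathbb{A} \to \mathrm{Set}$ preserve (finite) limits, then $F_\# M : \mathbb{B} \to \mathrm{Set}$
weakly preserves them: for every (finite) diagram $\Delta : I \to \mathbb{B}$, the set $F_\# M(\lim \Delta)$
is a weak limit of $F_\# M(\Delta)$ and thus contains $\lim F_\# M(\Delta)$ as a retract.

Together with the coherence of $L : \mathbb{B} \to \mathrm{Set}$, this weak preservation property of
$F_\# M$ suffices for the coherence of $\Phi M : \mathbb{B} \to \mathrm{Set}$. E.g., it preserves the products
because the map from $\Phi M(B) \times \Phi M(D)$ to $\Phi M(B \times D)$ in

$$\begin{array}{ccccc}
LB \times LD & \twoheadrightarrow & \Phi M(B) \times \Phi M(D) & \rightarrowtail & F_\# M(B) \times F_\# M(D) \\
\downarrow{\scriptstyle \cong} & & \downarrow & & \downarrow \\
L(B \times D) & \twoheadrightarrow & \Phi M(B \times D) & \rightarrowtail & F_\# M(B \times D)
\end{array} \qquad (6)$$

must be both surjective and injective.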

The object part of $\Phi : \mathrm{Mod}(\mathbb{A}) \to \mathrm{Mod}(\mathbb{B})$ is thus determined by (5). Notice that
$\Phi$ is not unique, as the definition depends on the choice of $L$ with $F^*L = M$.

To define the arrow part of $\Phi$, take an arbitrary $\mathbb{A}$-homomorphism $h : M' \to M''$.
It surely induces a natural transformation $F_\# h : F_\# M' \to F_\# M''$, and we can find
$\mathbb{B}$-models $L'$ and $L''$ that map by $F^*$ to $M'$ and $M''$, and determine $\mathbb{B}$-models $\Phi M'$
and $\Phi M''$, but $h : M' \to M''$ in general does not lift to a homomorphism $L' \to L''$.
However, $\Phi h : \Phi M' \to \Phi M''$ can be derived from $F_\# h : F_\# M' \to F_\# M''$ alone.
To simplify notation, write $N' = \Phi M'$ and $N'' = \Phi M''$ and $k = \Phi h$ for the desired
homomorphism.

We are given a natural family $hA : M'A \to M''A$ and we want to extend it to
$kB : N'B \to N''B$, so that $kFA = hA$. In other words, we have the subfamily of
functions $kFA : N'FA \to N''FA$, $A \in \mathbb{A}$, and we need to complete it to a natural
family $kB : N'B \to N''B$, $B \in \mathbb{B}$.

Consider first, for each $B \in \mathbb{B}$, the set $\mathcal{E}_B$ of regular epimorphisms $e : B \twoheadrightarrow FA_e$
in $\mathbb{B}$. The $e$-th component of the limit cone $F_\# M(B) \to M \circ \mathrm{Cod}(B/F)$ is a function
$F_\# M(B) \to MA_e$. Hence the map

$$F_\# M(B) \longrightarrow \prod_{e \in \mathcal{E}_B} MA_e \qquad (7)$$

Since $F : \mathbb{A} \to \mathbb{B}$ is powerful, this map is injective. By postcomposing (5) with it,
one gets

$$(Le)_{e \in \mathcal{E}_B} : LB \twoheadrightarrow \Phi M(B) \rightarrowtail \prod_{e \in \mathcal{E}_B} LFA_e \qquad (8)$$

because $MA_e = LFA_e$. Of course, since $L$ is coherent, each $Le : LB \to LFA_e$ is a
surjection. The set $\Phi M(B)$ can thus also be obtained by taking the product of all sets
$LFA_e$, such that there is some regular epi $e : B \twoheadrightarrow FA_e$ in $\mathbb{B}$, and then extracting
from this product the image of the tuple formed by all epis $Le : LB \twoheadrightarrow LFA_e$.

The construction of $kB : N'B \to N''B$ now proceeds by the following steps:

(i)  define a function

$$\kappa B : N'B \longrightarrow \wp(N''B)$$

such that

$$\kappa FA(x) = \{hA(x)\}$$

(ii)  show that $\kappa B(x)$ is nonempty for every $x \in N'B$;

(iii)  show that $\kappa B(x)$ has at most one element for every $x \in N'B$; writing $kB(x)$
for the only element of $\kappa B(x)$, we get the function $kB : N'B \to N''B$;

(iv)  prove that the obtained family $kB : N'B \to N''B$, $B \in \mathbb{B}$, is natural, i.e.
forms $k : N' \to N''$.

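(i)  Using the same set $\mathcal{E}_B$ of regular epis $e : B \twoheadrightarrow FA_e$ as above, define

$$\begin{aligned}
\kappa^e B(x) &= (N''e)^{-1} \circ hA_e \circ N'e(x) \\
\kappa B(x) &= \bigcap_{e \in \mathcal{E}_B} \kappa^e B(x)
\end{aligned} \qquad (9)$$

For $B = FA$, $\kappa^{\mathrm{id}} FA(x) = \{hA(x)\}$. Moreover, for every $e \in \mathcal{E}_{FA}$ holds

$$\kappa^{\mathrm{id}} FA(x) \subseteq \kappa^e FA(x)$$

Indeed, since $F$ is full, the naturality of $h$ implies that the square

$$\begin{array}{ccc}
N'FA & \xrightarrow{\ hA\ } & N''FA \\
{\scriptstyle N'e}\downarrow & & \downarrow{\scriptstyle N''e} \\
N'FA_e & \xrightarrow{\ hA_e\ } & N''FA_e
\end{array}$$

commutes. Hence (9), and thus $\kappa FA(x) = \{hA(x)\}$, as asserted.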
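
The set-theoretic shape of definition (9) can be illustrated on finite sets. The following is only a minimal sketch under simplifying assumptions: models are replaced by plain finite sets, each quotient is given by a triple of Python dicts standing for $N'e$, $N''e$ and $hA_e$, and the name `kappa` is hypothetical. It shows how intersecting the preimages over more quotients narrows the candidate set, down to a single element once the maps $N''e$ are jointly injective.

```python
def kappa(x, n2_b, quotients):
    """Intersection over all quotients e of (N''e)^{-1}(hA(N'e(x))).

    n2_b is the finite set standing for N''B. Each entry of quotients is a
    triple (n1_e, n2_e, h_a) of dicts standing for N'e, N''e and hA.
    """
    candidates = set(n2_b)
    for n1_e, n2_e, h_a in quotients:
        target = h_a[n1_e[x]]
        candidates &= {y for y in n2_b if n2_e[y] == target}
    return candidates


# Two quotients of {0, 1, 2, 3}: parity and integer halving.
n1_parity = {x: x % 2 for x in range(4)}
n2_parity = dict(n1_parity)
n1_half = {x: x // 2 for x in range(4)}
n2_half = dict(n1_half)
identity = {0: 0, 1: 1}  # stands for hA on the two-element quotient
n2_b = set(range(4))

one_quotient = kappa(3, n2_b, [(n1_parity, n2_parity, identity)])
both_quotients = kappa(
    3, n2_b, [(n1_parity, n2_parity, identity), (n1_half, n2_half, identity)]
)
```

With parity alone the candidate set for `x = 3` still has two elements; adding the second quotient, which together with parity separates all four points, leaves exactly one.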

(ii)  For every $B \in \mathbb{B}$, the set $\mathcal{E}_B$ is nonempty because it surely contains the regular
epi part $B \twoheadrightarrow FI \rightarrowtail F1 = 1$. $F1$ is terminal because $F$ is coherent; the regular image
of $B \to F1$ is in the image of $F$ because $F$ is powerful.

Moreover, since $N''$ is coherent, and $B \twoheadrightarrow FI$ is a cover (regular epi), $N''B \to N''FI$
must be a surjection. So if $N''B$ is empty, $N''FI$ must be empty, and hence
$N'FI$ must be empty, because there is a function $hI : N'FI \to N''FI$. But there is
also a function $N'B \to N'FI$, and thus $N'B$ must be empty as well, so there is a
unique $kB : N'B \to N''B$, and we are done.

With no loss of generality, we can thus assume that $N''B$ is nonempty. Since
$N''e : N''B \twoheadrightarrow N''FA_e$, $e \in \mathcal{E}_B$, is a surjection, all $N''FA_e$ are nonempty, and moreover,
every $\kappa^e(x) = (N''e)^{-1} \circ hA_e \circ N'e(x)$ is nonempty.

Finally, for any $e_0 : B \twoheadrightarrow FA_0$ and $e_1 : B \twoheadrightarrow FA_1$ from $\mathcal{E}_B$ the intersection
$\kappa^{e_0}(x) \cap \kappa^{e_1}(x)$ is nonempty as well. Toward a proof, consider the pair $\langle e_0, e_1 \rangle : B \to
FA_0 \times FA_1 = F(A_0 \times A_1)$ in $\mathbb{B}$. Factoring, and using once again the assumption that
$F$ is powerful, we get $e_2 : B \twoheadrightarrow FA_2$, with a pair $\langle p_0, p_1 \rangle : A_2 \rightarrowtail A_0 \times A_1$ in $\mathbb{A}$ such
that $e_i = Fp_i \circ e_2$, $i = 0, 1$. But $N''e_i = N''Fp_i \circ N''e_2$ implies

$$\kappa^{e_2}(x) \subseteq \kappa^{e_0}(x) \cap \kappa^{e_1}(x)$$

for all $x \in N'B$. Since $\kappa^{e_2}(x)$ has been proved nonempty, $\kappa^{e_0}(x) \cap \kappa^{e_1}(x)$ is.

A similar reasoning applies to any finite intersection of the sets $\kappa^e(x)$. But for the quotients
$e \in \mathcal{E}_B$ in a coherent category $\mathbb{B}$ the compactness applies: if any finite family is
consistent, then the whole set together is. Therefore, $\kappa B(x)$ is nonempty.

(iii)  So we can surely choose $kB(x) \in \kappa B(x)$. No matter which element we choose, the
equation

$$N''e \circ kB = kFA_e \circ N'e \qquad (10)$$

will hold for every $e \in \mathcal{E}_B$, because $kFA_e = hA_e$ and the definition of $\kappa B$ implies

$$N''e \circ kB = hA_e \circ N'e$$

On the other hand, recall that $N''B = \Phi M''B$ was defined so as to make the function
$(N''e)_{e \in \mathcal{E}_B}$ injective. But this means that the set of equations (10), for all $e \in \mathcal{E}_B$,
together determine at most one $kB(x)$, since the functions $N''e$ are jointly injective.

So the family $hA : N'FA \to N''FA$, $A \in \mathbb{A}$, extends to a uniquely determined
family $kB : N'B \to N''B$, $B \in \mathbb{B}$.

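(iv)  To prove that the family $kB : N'B \to N''B$ is natural, take an arbitrary arrow
$g : B_0 \to B_1$ in $\mathbb{B}$ and an arbitrary $e_1 : B_1 \twoheadrightarrow FA_1$ from $\mathcal{E}_{B_1}$. Let $e_0$ be the coimage
of $e_1 \circ g$

$$\begin{array}{ccc}
B_0 & \xrightarrow{\ g\ } & B_1 \\
{\scriptstyle e_0}\downarrow & & \downarrow{\scriptstyle e_1} \\
FA_0 & \rightarrowtail & FA_1
\end{array} \qquad (11)$$

The codomain of $e_0$ is in the image of $F$ because it is powerful.
We want to prove that the upper square in the diagram

[Diagram: the square with top edge $kB_0 : N'B_0 \to N''B_0$, bottom edge $kB_1 : N'B_1 \to N''B_1$ and sides $N'g$, $N''g$, placed above the images of (11) under $N'$ and $N''$ and the maps $hA_0$, $hA_1$.]

commutes. The lower square and the large outside trapezoid surely commute by the
definition of $kB$. The small trapezoid commutes by the naturality of $h$, and the two
triangles simply as the images of (11). Chasing, one concludes that

$$N''e_1 \circ kB_1 \circ N'g = N''e_1 \circ N''g \circ kB_0$$

But $e_1$ was taken as an arbitrary element of $\mathcal{E}_{B_1}$, so the last equation holds for all
such. Since they are, by the construction of $N'' = \Phi M''$, jointly monic,

$$kB_1 \circ N'g = N''g \circ kB_0$$

follows.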

This completes the proof of (b)⇒(c). The converse (c)⇒(b) can be proved by
modifying [22, thm. 7.1.4-4']. The argument is lengthy, based on the Łoś-Tarski theorem,
and I do not see a way to improve on it, so the reader may wish to consult the
original.

Finally, to connect (d) with the other three conditions, note that the assumption
of coproducts makes $\mathrm{Mod}(\mathbb{B})$ into a locally finitely presentable category, so that $F^*$
must have a left adjoint, like in [10, § 5], obtained by restricting the left Kan extension
of $F$. Hence (d)⇔(a). But a proof of this was already in [11] and [15], albeit in a
slightly different setting. □

An  immediate  consequence  of  theorem  1  and  proposition  4  is  a  precise  syntac¬ 
tic  characterisation  of  definitional  extensions,  the  interpretations  F  which  induce  an 
equivalence  F*  between  the  model  categories.  The  class  is  essentially  larger  than 
assumed  in  any  of  the  implemented  versions. 

Corollary 1. $F^* : \mathrm{Mod}(\mathbb{B}) \to \mathrm{Mod}(\mathbb{A})$ is an equivalence if and only if $F : \mathbb{A} \to \mathbb{B}$
is a powerful embedding, and subcovering.

A proof of this can also be derived from Makkai-Reyes' conceptual completeness
theorem [22, thm. 7.1.8], which is the main result of their book.

4.5.  Conclusions  and  further  work 

The research reported in this paper was originally motivated by the questions arising
from the semantics and the usage of Specware™, a tool for the automatic synthesis
of software systems, developed at Kestrel Institute. In particular, the original
semantics  of  pspecs  as  an  abstract  family  of  arrows  [29]  needed  to  be  refined  into 
a  precise  syntactic  characterisation  and  verified  semantically.  This  task  took  us  far 
afield,  into  nontrivial  model  theory  and  functorial  calculus,  and  brought  about  the 
above  theorem  relating  two  extant  notions  of  parametricity.  As  suggested  at  the  end 
of  section  4.2.4.,  it  can  be  viewed  as  an  indexed  completeness  result.  Formalizing  this 
view  might  lead  to  various  conceptual  and  meta-theoretical  insights. 

But  the  question  of  the  practical  repercussions  of  the  presented  material,  or  of 
their  absence,  seems  even  more  interesting.  The  immediate  task  should  probably  be 
to  analyze  closely  related  families  of  coherent  functors,  capturing  the  instantiations 
and the implementations of theories. The practice of parametric specification is based
upon them as much as upon the family of pspecs, studied in the present paper. Some
important issues of refinement directly require this further analysis.

However,  as  we  are  not  very  far  in  any  of  these  tasks,  the  main  point  of  present¬ 
ing  this  work  currently  is  not  this  or  that  particular  result,  but  showing  categorical 
model  theory  at  work  in  the  software  specification  framework  and  suggesting  a  first 
step  or  two  toward  developing  specific  tools  for  analyzing  and  designing  specification 
frameworks. 

If,  as  is  often  stated,  the  increasing  complexities  and  dynamics  of  evolving  software 
development  tasks  make  semantical  analyses  increasingly  important,  even  indispens¬ 
able  in  critical  cases,  then  mathematical  methods  of  the  kind  presented  here  may 
come  to  play  an  increasingly  important  role,  as  they  may  provide  enough  abstraction 
to  resolve  the  concrete  problems  where  formal  methods  are  genuinely  needed. 

5.  Other  mathematical  methods 

In this section, we present three methods for structuring theories: limits, interpretations,
and slicing. Limits allow us to find the “semantic intersection” of a collection
of theories, interpretations allow us to relate specifications in a more general way,
and slicing allows us to split a theory into a collection of meaningful parts.

5.1.  Limits 

The category of specifications in Specware has all finite colimits. Not only are they
useful, but they are efficiently computable with a linear time algorithm. Do limits of
specifications exist? Would they be useful? Can they be computed efficiently when
they exist? As a special case, what is the product of two specifications? What would it
mean? How could it be used? Some potential uses of limit computations include
slicing, evolution, and filling out a partial parallel refinement.

There  is  a  general  result  about  categories  that  provides  a  remarkable  inductive 
computation  of  limits  provided  certain  basic  special  cases  exist. 

Proposition  1.  If  a  category  has  a  final  object  and  all  pullbacks,  then  it  has  all 
finite  limits. 

Proof: It is fairly easy to show that if a category has a final object, products, and
equalizers then it has all finite limits [4]. It can be further shown that if a final object and
pullbacks exist, then we can compute products and equalizers; e.g. the product of two
objects $A$ and $B$ is the pullback of the diagram $A \to 1 \leftarrow B$ where 1 is the final
object.

Consider a pullback situation. Given specs $A$, $B$, and $C$ and morphisms
$A \xrightarrow{g} C \xleftarrow{h} B$, does a cone $A \xleftarrow{i} P \xrightarrow{j} B$ exist such that

(1)  the diagram commutes: $g \circ i = h \circ j$, and

(2)  the cone is universal: for any other cone $A \xleftarrow{k} Q \xrightarrow{\ell} B$ (such that
$g \circ k = h \circ \ell$) there exists a unique arrow $u : Q \to P$ that factors $k$ and $\ell$, i.e. such
that $k = i \circ u$ and $\ell = j \circ u$.^a

^a The definition of cone includes commuting of the diagram.
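
Conditions (1) and (2) can be made concrete in the category of finite sets, which is much simpler than SPEC but has the same universal property. The following is a minimal sketch under that assumption; the helper names `pullback` and `mediating` are hypothetical. It builds $P$ as the set of pairs that agree over $C$, builds the mediating arrow for another cone, and recovers the product as the pullback over a one-point final object.

```python
def pullback(A, B, g, h):
    """Pullback of g : A -> C and h : B -> C in finite sets.

    Returns (P, i, j): P is the set of pairs (a, b) with g[a] == h[b],
    and i, j are the projections P -> A and P -> B, given as dicts.
    """
    P = {(a, b) for a in A for b in B if g[a] == h[b]}
    i = {p: p[0] for p in P}
    j = {p: p[1] for p in P}
    return P, i, j


def mediating(Q, k, ell):
    """The arrow u : Q -> P induced by a cone (k, ell) with g.k == h.ell."""
    return {q: (k[q], ell[q]) for q in Q}


A, B = {"a1", "a2"}, {"b1", "b2", "b3"}
g = {"a1": "c1", "a2": "c2"}
h = {"b1": "c1", "b2": "c1", "b3": "c3"}
P, i, j = pullback(A, B, g, h)

# (1) the square commutes on every element of P
commutes = all(g[i[p]] == h[j[p]] for p in P)

# (2) another cone factors through P
Q = {"q"}
k, ell = {"q": "a1"}, {"q": "b2"}
u = mediating(Q, k, ell)
factors = all(i[u[q]] == k[q] and j[u[q]] == ell[q] for q in Q)

# The product of A and B as the pullback over the final object 1 = {"*"}
to_one_a = {a: "*" for a in A}
to_one_b = {b: "*" for b in B}
product_ab, _, _ = pullback(A, B, to_one_a, to_one_b)
```

Over the one-point set every pair agrees, so the pullback is the full cartesian product, which is the special case used in the proof of Proposition 1.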

We  approach  the  question  of  whether  limits  of  specifications  exist  by  examining 
incrementally  the  following  categories  of  interest: 

SIG = Category of signatures and their morphisms. A signature consists of a set of
sort symbols and a set of (higher-order) operator symbols together with their arities.
A signature morphism maps each symbol of the domain signature to a symbol of the
codomain, such that the arity of each operator symbol is preserved under translation.
For example, suppose that the domain has sorts $D$ and $R$ and operator $f : D \to R$, and the
codomain has sorts $E$ and $S$ and operator $g : E \to S$. The map $m$ is a signature morphism:
$m = \{f \mapsto g,\ D \mapsto E,\ R \mapsto S\}$. Note that $m$ ensures compatible translation of the
arity of $f$. To put it another way, a signature morphism is required to preserve the
sort constructions of a signature, whether these constructions arise in giving the arity
of an operator or in explicit sort definitions.

A  signature  morphism  translates  an  expression,  such  as  an  axiom,  by  context-free 
translation  of  constituent  symbols. 
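
The arity condition and the context-free translation can be sketched as follows. This is an illustration only: signatures are reduced to first-order operators with one argument sort and one result sort, expressions are nested tuples, the arities $f : D \to R$ and $g : E \to S$ are assumed for the example above, and the function names are hypothetical.

```python
def is_signature_morphism(m, dom_ops, cod_ops):
    """Check that m sends each operator to one with the translated arity.

    dom_ops and cod_ops map an operator name to (argument sort, result sort).
    m is assumed to be defined on every sort and operator of the domain.
    """
    for op, (arg, res) in dom_ops.items():
        if cod_ops.get(m[op]) != (m[arg], m[res]):
            return False
    return True


def translate(m, expr):
    """Context-free translation: rename each symbol of expr along m.

    An expression is a symbol (str) or a tuple of sub-expressions.
    Symbols outside the signature, such as bound variables and the
    built-in logical operators, are left unchanged.
    """
    if isinstance(expr, str):
        return m.get(expr, expr)
    return tuple(translate(m, part) for part in expr)


dom_ops = {"f": ("D", "R")}
cod_ops = {"g": ("E", "S")}
m = {"f": "g", "D": "E", "R": "S"}

ok = is_signature_morphism(m, dom_ops, cod_ops)
axiom = ("forall", ("x", "D"), ("=", ("f", "x"), ("f", "x")))
image = translate(m, axiom)
```

Swapping the sort assignments, for instance mapping $D$ to $S$ and $R$ to $E$, makes the check fail, because the arity of $f$ would no longer translate to the arity of $g$.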

For simplicity we assume that the arity of an operator is built up from products
and function sorts (exponentials). We assume that the boolean sort boolean and the logical
quantifiers and operators $\forall$, $\exists$, $\wedge$, $\vee$, $\Rightarrow$ are built in. All sorts are equipped
with an equality.

THY  =  Category  of  theories  and  their  morphisms;  i.e.  an  object  in  THY  is  a  sig¬ 
nature  plus  a  set  of  sentences  that  are  closed  under  entailment  (called  the  theorems  of 
the  theory).  The  morphisms  are  essentially  signature  morphisms,  with  the  additional 
condition  that  they  translate  theorems  to  theorems. 

SPEC  =  the  category  of  (higher-order)  specs  and  their  morphisms.  An  object  in 
SPEC  is  a  signature  plus  a  finite  (or  more  generally  recursive)  set  of  sentences  called 
the  axioms  of  the  specification.  The  morphisms  are  essentially  signature  morphisms, 
with  the  additional  condition  that  they  translate  axioms  to  theorems.  This  condition 
implies  that  the  morphisms  also  translate  theorems  to  theorems. 

Say something about other sort constructions such as sums, quotients, and subsorts?
Same general idea, but the final object is just more complex?

Proposition 2. SIG has all finite limits.

Proof: We need to show that SIG has a final object and pullbacks. The final
object $1_{SIG}$ consists of one sort (say $D$) and one operator for each finite arity (i.e.
$f_{m,n} : D^m \to D^n$ where $m > 0 \wedge n > 0$). To show universality, note that there is
a unique arrow from an arbitrary signature $S \to 1_{SIG}$ which takes each sort to $D$
and each operator of $S$ to the unique operator in $1_{SIG}$ that guarantees signature
compatibility.

The pullback of signatures is the fiber product over $C$ of the sorts and ops. If $\gamma$
is a sort of $C$, then the fiber over $\gamma$ is $g^{-1}(\gamma) \times h^{-1}(\gamma)$. The elements of this fiber are
pairs of sorts, say $(\alpha, \beta)$, which can be thought of as the sort product $\alpha \times \beta$. If $c$ is
an operator of $C$, then the fiber over $c$ is $g^{-1}(c) \times h^{-1}(c)$. The elements of this fiber
are pairs of operators, say $(a : D \to R,\ b : E \to S)$, which can be thought of as the
function product $a \times b : D \times E \to R \times S$ such that $a \times b : (d, e) \mapsto (a(d), b(e))$.

To show universality, let $A \xleftarrow{k} Q \xrightarrow{\ell} B$ be another cone. Define the universal
arrow $u : Q \to P$ as follows. For each sort $\sigma$ in $Q$, let $u : \sigma \mapsto \alpha \times \beta$ if $k(\sigma) = \alpha$ and
$\ell(\sigma) = \beta$. For each operator $q$ of $Q$, let $u : q \mapsto a \times b$ if $k(q) = a$ and $\ell(q) = b$. This is
unique since no other translation will commute. QED

The pullback in SIG can be easily computed in time that is $O(\max(|A|, |B|, |P|))$
(which is at most $|A| \times |B|$).
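
The fiber-product construction can be sketched directly. The code below is a simplified illustration, not the Specware implementation: a signature is a dict of sort names and unary operators, the morphisms into $C$ are plain dicts, and all names are hypothetical. For brevity it enumerates all candidate pairs, which costs $|A| \times |B|$ steps rather than the sharper bound quoted above.

```python
from itertools import product


def fiber_product(xs, ys, g, h):
    """Pairs (x, y) lying over the same element of C, i.e. with g[x] == h[y]."""
    return [(x, y) for x, y in product(xs, ys) if g[x] == h[y]]


def signature_pullback(sig_a, sig_b, g, h):
    """Pullback of two signatures over C, computed on sorts and on operators.

    A signature is a dict with keys "sorts" (a list of names) and "ops"
    (a dict from operator name to (argument sort, result sort)).
    g and h are the symbol maps from A and B into C.
    """
    sorts = fiber_product(sig_a["sorts"], sig_b["sorts"], g, h)
    ops = {}
    for a, b in fiber_product(sig_a["ops"], sig_b["ops"], g, h):
        (d, r), (e, s) = sig_a["ops"][a], sig_b["ops"][b]
        # The paired operator a x b has arity D x E -> R x S.
        ops[(a, b)] = ((d, e), (r, s))
    return {"sorts": sorts, "ops": ops}


sig_a = {"sorts": ["D", "R"], "ops": {"a": ("D", "R")}}
sig_b = {"sorts": ["E", "S"], "ops": {"b": ("E", "S")}}
g = {"D": "X", "R": "Y", "a": "c"}
h = {"E": "X", "S": "Y", "b": "c"}
P = signature_pullback(sig_a, sig_b, g, h)
```

Because $g$ and $h$ are signature morphisms, the argument and result sorts of each paired operator are themselves paired sorts of the pullback, so the result is again a signature.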

Next  we  look  at  limits  in  the  category  of  theories,  THY.  The  key  idea  here  is  that 
a  theorem  in  the  pullback  corresponds  to  theorems  in  the  components  A  and  B  that 
have  the  same  abstract  form  as  some  theorem  in  C.  (clarify!) 

Proposition  3.  THY  has  all  finite  limits.  ® 

Proof: We need to show that THY has a final object and pullbacks. The final
object in THY is essentially the final object $1_{SIG}$ of SIG together with the set of all
sentences constructible in that language. Naturally it is inconsistent.

To show universality, simply note that there is a unique arrow from the signature
of an arbitrary theory $T \to 1_{THY}$ and that it translates each source theorem to the
unique theorem in $1_{THY}$ that has the same abstract form.

The pullback of theories extends the pullback on the underlying signatures. The
fiber over a theorem $t$ of $C$ is essentially $g^{-1}(t) \times h^{-1}(t)$. Each theorem in the fiber
corresponds to a pair $(t_A, t_B)$ and can be represented in the language of $P$ as follows.
First, note that $t_A$ and $t_B$ have the same abstract form because of the context-free
way that the underlying signature morphisms translate expressions. We define a
function that recursively translates a pair of expressions into the language of $P$:

$$\mathrm{Translate}(\forall (x : s_A)\, P_A(x),\ \forall (x : s_B)\, P_B(x)) = \forall (x : s_A \times s_B)\ \mathrm{Translate}(P_A(x), P_B(x))$$

$$\mathrm{Translate}(P_A \wedge Q_A,\ P_B \wedge Q_B) = \mathrm{Translate}(P_A, P_B) \wedge \mathrm{Translate}(Q_A, Q_B)$$

$$\mathrm{Translate}(f_A(a_A),\ f_B(a_B)) = (f_A \times f_B)(\mathrm{Translate}(a_A, a_B))$$

$$\mathrm{Translate}(c_A,\ c_B) = c_A \times c_B \quad \text{(for constants } c_A \text{ and } c_B\text{)}$$

and so on. The existence of the sort $s_A \times s_B$ in $P$ is guaranteed by the context-free
translation of expressions by signature morphisms and by the assumption that
$g(t_A) = t = h(t_B)$. Similar guarantees apply to $f_A \times f_B$, $c_A \times c_B$, etc.

®  Should the pullback in THY include all defined ops? What if the pullback in SIG is empty, yet the pullback
in THY could have lots of paired theorems if the right ops were there...
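
The recursive pairing can be sketched on a small expression type. This is a minimal illustration covering only the four cases listed (quantifier, conjunction, application, constant) plus bound variables; the class names, the `paired` naming scheme, and the requirement that both bound variables carry the same name are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    op: str
    arg: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Forall:
    var: str
    sort: str
    body: object


def paired(a, b):
    """Name of the paired symbol a x b in the language of P."""
    return f"{a} x {b}"


def translate(ea, eb):
    """Translate a pair of expressions of the same abstract form into P."""
    if isinstance(ea, Forall) and isinstance(eb, Forall) and ea.var == eb.var:
        return Forall(ea.var, paired(ea.sort, eb.sort), translate(ea.body, eb.body))
    if isinstance(ea, And) and isinstance(eb, And):
        return And(translate(ea.left, eb.left), translate(ea.right, eb.right))
    if isinstance(ea, App) and isinstance(eb, App):
        return App(paired(ea.op, eb.op), translate(ea.arg, eb.arg))
    if isinstance(ea, Const) and isinstance(eb, Const):
        return Const(paired(ea.name, eb.name))
    if isinstance(ea, Var) and isinstance(eb, Var) and ea.name == eb.name:
        return ea
    raise ValueError("expressions do not have the same abstract form")


t_a = Forall("x", "sA", And(App("pA", Var("x")), App("qA", Const("cA"))))
t_b = Forall("x", "sB", And(App("pB", Var("x")), App("qB", Const("cB"))))
t_p = translate(t_a, t_b)
```

A pair of expressions whose shapes differ raises `ValueError`, which corresponds to the pair not lying in a common fiber over $C$.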

We need to show that this object is a theory; i.e. it is closed under entailment (or
logical consequence). A sketch:^ Consider a set of theorems of $P$ that are formed as
defined above, call them $\{t_{P_e}\}_{e=1,\ldots,n}$. By construction we have

[Diagram: $t_{P_e}$ projects to $t_{A_e}$ and $t_{B_e}$, which both translate to $t_{C_e}$.]

for $e = 1, \ldots, n$. Suppose that we can infer $t_P$ from $\{t_{P_e}\}_{e=1,\ldots,n}$ via rule $R$. Under the
mild assumption that $R$ acts on the syntactic form of theorems, the analogous
inferences will be made in $A$, $B$, and $C$, so we have

[Diagram: $t_P$ projects to $t_A$ and $t_B$, which both translate to $t_C$.]

i.e. inductively we have $t_A$, $t_B$, and $t_C$ as theorems of $A$, $B$, and $C$ respectively, so $t_P$
is a theorem of $P$ by construction. Consequently, $P$ is closed under inference and it
forms a theory.

We also need to show that $i$ and $j$ are theory morphisms and that the square
commutes. The projection morphisms $i$ and $j$ simply unpack the theorem $t_A \times t_B$ into
$t_A$ and $t_B$ respectively (from whence it was formed). The diagrams above indicate the
essential reason for the commuting of the diagram.

To show universality, define the universal arrow $u : Q \to P$ as we did in SIG.
We must show that it translates theorems to theorems and that it factors $k$ and $\ell$.
Suppose that $t_Q$ is a theorem of $Q$ with $k(t_Q) = t_A$ and $\ell(t_Q) = t_B$; then
$u(t_Q) = t_A \times t_B$, with $i(u(t_Q)) = t_A$ and $j(u(t_Q)) = t_B$,
shows the action of $u$ and the factoring of $k$ and $\ell$ with respect to theorems. QED

^  We do induction on the proof structure of an arbitrary theorem in P.

Corollary  1.  The  product  of  two  theories  exists  and  is  comprised  of  theorems 
that  have  the  same  abstract  form  in  both  A  and  B  (clarify!). 

Next we look at limits in the category of specs, SPEC. Unfortunately these do not
in general exist. For example, the final object in SPEC, $1_{SPEC}$, must be a finite (or
recursive) presentation of $1_{THY}$. One can imagine some meta-machinery for presenting
this finitely, such as an algorithm for enumerating it, but this is not a straightforward
presentation format.

A more difficult question is the existence of a finite/recursive axiomatization of
a  pullback  theory.  If  we  just  consider  the  pullback  on  the  axioms  of  A  and  B,  there 
may  be  none  with  the  same  arity-structure,  so  the  fiber  product  of  theorems  would 
be  empty  (which  is  not  a  sufficient  axiomatization!). 

A  special  case:  if  we  consider  the  product  of  a  spec  with  itself  (e.g.  the  product  of 
group  theory),  then  we  do  get  an  axiomatization  of  the  product,  which  is  isomorphic 
to  the  spec  itself  (so  the  product  of  groups  is  a  group!). 

Nevertheless, pullbacks do exist for a wide subcategory of SPEC. Monics in SPEC
correspond exactly to morphisms in which the underlying signature morphism is injective
and axioms translate to axioms (?). Monics in SPEC include identity arrows,
translate arrows, definitional extensions, c-def arrows, imports/inclusions/extensions,
conservative arrows, and compositions of these.

Proposition 4. Pullbacks of monics exist in SPEC.

Proof: The key idea is that the fiber over a symbol or axiom is a singleton set.
We show this result for simple inclusions, but the generalization to arbitrary monics
is straightforward. Suppose that $g$ and $h$ are inclusions; then $symbols(P) =
symbols(A) \cap symbols(B)$ and $axioms(P) = axioms(A) \cap axioms(B)$. Note that
each symbol in the axioms of $P$ must be a symbol in both $A$ and $B$, so the presentation
of $P$ is closed. Next note that $P$ is included in $A$ and in $B$, so we have a cone.

To show universality, let $A \leftarrow Q \rightarrow B$ be another cone. Since $Q$ is included in
$A$ and in $B$, it must be included in $A \cap B$, which is $P$. Such an inclusion is unique.
QED
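
For inclusions, the construction in this proof is a plain intersection, which the following sketch makes explicit. The representation is an assumption made for illustration: a spec is a dict holding a set of symbol names and a dict from axiom names to the set of symbols each axiom mentions; the function name and the two sample specs are hypothetical.

```python
def pullback_of_inclusions(spec_a, spec_b):
    """Pullback of two inclusions into a common spec, as an intersection.

    A spec is a dict with "symbols" (a set of names) and "axioms"
    (a dict from axiom name to the set of symbols the axiom mentions).
    An axiom is shared when both specs contain it with the same symbols.
    """
    symbols = spec_a["symbols"] & spec_b["symbols"]
    axioms = {
        name: used
        for name, used in spec_a["axioms"].items()
        if spec_b["axioms"].get(name) == used
    }
    # The presentation is closed: every symbol of a shared axiom is shared.
    assert all(used <= symbols for used in axioms.values())
    return {"symbols": symbols, "axioms": axioms}


monoid = {
    "symbols": {"E", "op", "unit"},
    "axioms": {"assoc": {"E", "op"}, "unit_law": {"E", "op", "unit"}},
}
involutive_semigroup = {
    "symbols": {"E", "op", "inv"},
    "axioms": {"assoc": {"E", "op"}, "involution": {"E", "inv"}},
}
P = pullback_of_inclusions(monoid, involutive_semigroup)
```

Here the shared part is the semigroup fragment: the sort and the binary operator, with associativity as the only common axiom.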

Proposition 5. Pullbacks (when they exist) preserve identity arrows, translate
arrows, definitional extensions, c-def arrows, imports/inclusions/extensions, conservative
arrows, monics, and p-spec arrows.

5.2.  Interpretations 

In the 1970's, Goguen and Burstall discovered the use of pushouts to instantiate
parametrized specifications. In Specware, we tried to generalize this idea and use
pushouts to instantiate parametrized interpretations; however, it didn't seem to work.
We named this problem “triv-to-triv-via-subsort”, after one of the proposed but unsatisfactory
solutions.

This  section  proposes  to  solve  this  problem  by  recasting  interpretation  morphisms 
as  squares  of  interpretations  rather  than  triples  of  morphisms. 

As  an  example,  let’s  try  to  instantiate  List-to-List  (the  identity)  with  Odd-to-Nat. 
We  expect  to  obtain  an  interpretation  from  List[Odd]  to  List[Nat].  For  reference, 
here’s  the  mediator  of  Odd-to-Nat: 

spec Odd-as-Nat is
  import Empty
  sort Odd = Nat | odd?
end-spec

We try to take pushouts according to this diagram:

[Diagram: top row List ← Triv → Odd; the lower rows, through the mediator Odd-as-Nat, are not recoverable.]

However, we cannot define a suitable morphism from Triv to Odd-as-Nat. If we map
E to Odd then the upper square commutes. If we map E to Nat then the lower square
commutes. We can't have both.

To  fix  the  problem,  we  tried  replacing  Triv-to-Triv  with  Triv-to-Triv-via-Subsort: 

spec TRIV-as-TRIV-via-SUBSORT is
  sorts E-dom, E-cod
  sort E-dom = E-cod | p?
  op p? : E-cod -> Boolean
end-spec

We can find an interpretation morphism from Triv-to-Triv-via-Subsort to Odd-to-Nat,
but not to List-to-List:

[Diagram: List, Triv-as-Triv-via-Subsort, and Odd-as-Nat in the middle row, with Empty below; the arrows are not recoverable.]

So this idea pushes the problem around but doesn't solve it.

In the example above, we tried to construct an interpretation from List[Odd]
to List[Nat]. Although this seems reasonable, it is not. To see why, let's try a similar
example that makes the error more apparent. Instead of Odd-to-Nat, let's try Pair-to-Nat,
which refines an abstract sort Pair to a pair of naturals:

spec  Pair-as-Nat  is 
import  Empty 
sort  Pair  =  Nat,  Nat 
end-spec 

If the instantiation were to succeed, we would obtain a refinement from List[Pair] to
List[Nat]. What would this refinement look like? We could keep two lists of naturals,
one for the left component, one for the right. Such a construction depends on an
injection:

$$\mathrm{List}(\mathrm{Pair}(A)) \to \mathrm{Pair}(\mathrm{List}(A))$$
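One way to picture this injection is the familiar "unzip" operation. The sketch below is only an illustration in Python, not Specware code; the names `unzip` and `rezip` are hypothetical. It shows that a list of pairs can always be split into a pair of lists and recovered, while a pair of lists of different lengths has no preimage, which is why the map is an injection and not a bijection.

```python
def unzip(pairs):
    """Send a list of pairs to a pair of lists (left components, right components)."""
    lefts = [a for a, _ in pairs]
    rights = [b for _, b in pairs]
    return lefts, rights


def rezip(lefts, rights):
    """Recover the list of pairs; only defined when the two lengths agree."""
    if len(lefts) != len(rights):
        raise ValueError("no list of pairs unzips to lists of different lengths")
    return list(zip(lefts, rights))


pairs = [(1, 2), (3, 4), (5, 6)]
split = unzip(pairs)          # the two lists of naturals
round_trip = rezip(*split)    # the original list of pairs
```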

However, this is not what we want. We want to refine List[Pair] to List[Nat × Nat],
not List[Nat].

It turns out that we can obtain the desired result by replacing interpretation
morphisms with commuting squares of interpretations. Thus:


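[Diagram: a square with A and C on top and B and D below, drawn twice; in the second copy the bottom edge is an interpretation B ==> D.]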


Then  there  is  no  problem  constructing  the  appropriate  diagram: 


List <--- Triv ---> Pair
 ||        ||        ||
 \/        \/        \/
List <=== Triv ===> Empty

The interpretation from Pair to Empty constructs pairs of naturals, as does the
interpretation from Triv to Empty; thus, the right square commutes. The upper pushout
constructs List[Pair] and the lower one constructs List[Nat × Nat], as desired.


5.3.  Slicing 

The  term  “theory  slicing”  refers  to  two  different  problems: 

- To factor specifications into pieces that are likely to be reusable.

- To eliminate operators and axioms that are not needed for a given set of operator
definitions.

We  only  discuss  the  second  kind  here;  its  primary  use  is  in  code  generation  to  minimize 
the  size  of  the  target  code. 

We  structure  the  relationship  between  sets  of  operations  and  axioms  using  Galois 
connections. 

A partial order is a set A with a relation $\le$ that is reflexive, transitive, and
antisymmetric. Antisymmetric means that $a \le b$ and $b \le a$ imply $a = b$.

A Galois connection or Galois pair (F, G) is a pair of monotone functions

$$F : A \to B \qquad G : B \to A$$

between partial orders A and B such that $a \le GFa$ and $FGb \le b$ for all a in A
and b in B.

An isomorphism is a Galois connection in which $a = GFa$ and $FGb = b$. Unlike
an isomorphism, a Galois connection is asymmetric: (G, F) may not be Galois even
if (F, G) is. The categorical concept of adjunction further generalizes a Galois pair,
but we don’t need adjunctions here.

Given a spec S, let Axiom be the set of all axioms in S and let Op be the set of all
operators. We define $\mathit{Axioms} = \mathcal{P}(\mathit{Axiom})$ and
$\mathit{Ops} = \mathcal{P}(\mathit{Op})$; that is, Axioms and Ops are the sets of all
subsets of axioms and operators, so an element of Axioms is a set of axioms. The sets
Axioms and Ops are partial orders, ordered by set containment.

We  can  define  functions 


$$F : \mathit{Axioms} \to \mathit{Ops}$$
$$G : \mathit{Ops} \to \mathit{Axioms}$$

so  that  F  maps  a  set  of  axioms  to  the  set  of  operations  used  by  the  axioms,  and 
G  maps  a  set  of  operators  to  the  set  of  axioms  that  use  at  most  these  operations.  F 
is  monotone  because  more  axioms  use  more  operators,  and  G  is  monotone  because 
more  operations  allow  us  to  state  more  axioms. 

The function GF takes a set of axioms a to the (larger) set of axioms stateable
using the operators of a. The function FG takes a set of operators b to the (smaller)
set of operators that actually occur in the axioms stateable using b. Thus
$a \le GFa$ and $FGb \le b$, so (F, G) is a Galois pair.
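The two inequalities can be checked mechanically on a small example. The following Python sketch is only an illustration under stated assumptions: the toy spec `USES` (axiom names mapped to the operators they mention) and the helper `subsets` are hypothetical and do not come from Specware. F and G follow the definitions given above.

```python
from itertools import chain, combinations

# Hypothetical toy spec: each axiom name maps to the operators it mentions.
USES = {
    "assoc": {"plus"},
    "comm": {"plus"},
    "distrib": {"plus", "times"},
    "neg_inv": {"plus", "neg"},
}
ALL_OPS = frozenset(chain.from_iterable(USES.values()))


def F(axioms):
    """Operators used by the given set of axioms."""
    return frozenset(chain.from_iterable(USES[ax] for ax in axioms))


def G(ops):
    """Axioms that use at most the given operators."""
    return frozenset(ax for ax, used in USES.items() if used <= ops)


def subsets(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


# a <= GFa for every set of axioms a, and FGb <= b for every set of operators b.
unit_holds = all(a <= G(F(a)) for a in subsets(USES))
counit_holds = all(F(G(b)) <= b for b in subsets(ALL_OPS))
```

For instance, `G(F({"assoc"}))` also contains `"comm"`, since commutativity is stateable using `plus` alone, while `F(G({"times"}))` is empty because no axiom of the toy spec mentions only `times`.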

An equivalent formulation of Galois pair is that for all a and b,

$$Fa \le b \iff a \le Gb$$

and  it  is  worth  checking  that  this  condition  also  holds  in  our  example. 
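A brute-force version of that check is sketched below in Python. The toy spec `USES` and the helper `subsets` are hypothetical and chosen only for illustration; F and G are the maps defined earlier, with set containment as the order.

```python
from itertools import chain, combinations

# Hypothetical toy spec: each axiom name maps to the operators it mentions.
USES = {
    "assoc": {"plus"},
    "comm": {"plus"},
    "distrib": {"plus", "times"},
    "neg_inv": {"plus", "neg"},
}
ALL_OPS = frozenset(chain.from_iterable(USES.values()))


def F(axioms):
    """Operators used by the given set of axioms."""
    return frozenset(chain.from_iterable(USES[ax] for ax in axioms))


def G(ops):
    """Axioms that use at most the given operators."""
    return frozenset(ax for ax, used in USES.items() if used <= ops)


def subsets(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


# Fa <= b holds exactly when a <= Gb, for every pair (a, b).
equivalent = all(
    (F(a) <= b) == (a <= G(b))
    for a in subsets(USES)
    for b in subsets(ALL_OPS)
)
```

Both sides say the same thing in this example: every axiom in a mentions only operators from b.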

For any Galois connection, we can show that the composites FG and GF are
idempotent, that is, that $FGFG = FG$ and $GFGF = GF$. (GF is a closure operator
and FG is the dual notion, an interior operator.) Thus, we only need to apply these
operators once; applying them further has no effect. Here is the idempotence proof for GF:


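$$
\begin{aligned}
a &\le GFa && \text{assumption} \\
GFa &\le GFGFa && \text{monotonicity of } GF \\
FGb &\le b && \text{assumption} \\
FGFa &\le Fa && \text{substitution of } Fa \text{ for } b \\
GFGFa &\le GFa && \text{monotonicity of } G \\
GFGFa &= GFa && \text{antisymmetry of } \le
\end{aligned}
$$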
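The same fact can be observed on a concrete instance. The Python sketch below assumes a hypothetical toy spec `USES`; it is an illustration of the proof above, not part of Specware. It applies GF and FG twice to every subset and compares the results with a single application.

```python
from itertools import chain, combinations

# Hypothetical toy spec: each axiom name maps to the operators it mentions.
USES = {
    "assoc": {"plus"},
    "comm": {"plus"},
    "distrib": {"plus", "times"},
    "neg_inv": {"plus", "neg"},
}
ALL_OPS = frozenset(chain.from_iterable(USES.values()))


def F(axioms):
    """Operators used by the given set of axioms."""
    return frozenset(chain.from_iterable(USES[ax] for ax in axioms))


def G(ops):
    """Axioms that use at most the given operators."""
    return frozenset(ax for ax, used in USES.items() if used <= ops)


def GF(axioms):
    """All axioms stateable using the operators of the given axioms."""
    return G(F(axioms))


def FG(ops):
    """Operators that actually occur in axioms stateable using `ops`."""
    return F(G(ops))


def subsets(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


gf_idempotent = all(GF(GF(a)) == GF(a) for a in subsets(USES))
fg_idempotent = all(FG(FG(b)) == FG(b) for b in subsets(ALL_OPS))
closed = GF({"distrib"})  # axioms stateable using plus and times
```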


A partial order with complement is a partial order A with an operation
$(-)^c : A \to A$ such that

$$(a^c)^c = a \quad\text{and}\quad a \le b \iff b^c \le a^c.$$

Given a monotone function $F : A \to B$ between partial orders with complement, we
can define $F^c : A \to B$ by $F^c a = F(a^c)^c$. Then, if (F, G) is a Galois pair, so
is $(G^c, F^c)$:

$$
\begin{aligned}
& G^c b \le a \\
\iff{} & G(b^c)^c \le a && \text{definition of } G^c \\
\iff{} & a^c \le G(b^c) && \text{complement reverses order} \\
\iff{} & F(a^c) \le b^c && F \text{ and } G \text{ are Galois} \\
\iff{} & (b^c)^c \le F(a^c)^c && \text{complement reverses order} \\
\iff{} & b \le F(a^c)^c && \text{complement is an involution} \\
\iff{} & b \le F^c a && \text{definition of } F^c
\end{aligned}
$$

In our example, $F^c a$ is the set of operations that don’t occur outside a, and $G^c b$
is the set of axioms that use some operation from b. $F^c$ is hard to grasp, but $G^c$ is
quite natural.

The map $G^c F^c$ takes a set of axioms a to the subset obtained by throwing out
axioms all of whose operations occur outside a. The map $F^c G^c$ takes a set of operations
b and enlarges it by examining the axioms that don’t touch b and adding the other
operations they don’t use. As before, both these maps are idempotent.
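The complemented maps can be tried out on a small example. In the Python sketch below, the toy spec `USES` and the names `Fc`, `Gc`, and `subsets` are hypothetical and serve only as an illustration; complement is taken relative to the full sets of axioms and operators of the toy spec.

```python
from itertools import chain, combinations

# Hypothetical toy spec: each axiom name maps to the operators it mentions.
USES = {
    "assoc": {"plus"},
    "comm": {"plus"},
    "distrib": {"plus", "times"},
    "neg_inv": {"plus", "neg"},
}
ALL_AXIOMS = frozenset(USES)
ALL_OPS = frozenset(chain.from_iterable(USES.values()))


def F(axioms):
    """Operators used by the given set of axioms."""
    return frozenset(chain.from_iterable(USES[ax] for ax in axioms))


def G(ops):
    """Axioms that use at most the given operators."""
    return frozenset(ax for ax, used in USES.items() if used <= ops)


def Fc(axioms):
    """F^c a = F(a^c)^c: operators that do not occur outside `axioms`."""
    return ALL_OPS - F(ALL_AXIOMS - axioms)


def Gc(ops):
    """G^c b = G(b^c)^c: axioms that use some operator from `ops`."""
    return ALL_AXIOMS - G(ALL_OPS - ops)


def subsets(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


# G^c b <= a exactly when b <= F^c a, as in the chain of equivalences above.
dual_galois = all(
    (Gc(b) <= a) == (b <= Fc(a))
    for a in subsets(ALL_AXIOMS)
    for b in subsets(ALL_OPS)
)
shrinks = all(Gc(Fc(a)) <= a for a in subsets(ALL_AXIOMS))
enlarges = all(b <= Fc(Gc(b)) for b in subsets(ALL_OPS))
```

In this toy spec, `Gc({"neg"})` contains only `"neg_inv"`, and `Gc(Fc({"assoc"}))` is empty because `plus` also occurs in axioms outside `{"assoc"}`.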

There is another pair of Galois connections available to us. For the moment, we
define a specification to be a pair (O, A) of operations and axioms such that
$\mathrm{ops}\,A \subseteq O$. Then the subspecifications of a spec S form a partial
order with complement $\mathcal{P}(S)$ under pairwise containment.

We define four monotone functions from $\mathcal{P}(S)$ to itself:

M s = remove all operations of s not used in its axioms
N s = add all operations of S to s
I s = remove all axioms of s
J s = add all axioms stateable using operations of s

Then (M, N) and (I, J) are both Galois.
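A small Python sketch can confirm both claims on an example. Everything here is an assumption made for illustration: the toy spec `USES`, the representation of a subspecification as a pair `(ops, axioms)` of frozensets, and the helper names `leq`, `subsets`, and `SUBSPECS` are hypothetical. The four functions follow the definitions above.

```python
from itertools import chain, combinations

# Hypothetical toy spec: each axiom name maps to the operators it mentions.
USES = {
    "assoc": {"plus"},
    "comm": {"plus"},
    "distrib": {"plus", "times"},
    "neg_inv": {"plus", "neg"},
}
ALL_OPS = frozenset(chain.from_iterable(USES.values()))


def F(axioms):
    """Operators used by the given set of axioms."""
    return frozenset(chain.from_iterable(USES[ax] for ax in axioms))


def G(ops):
    """Axioms that use at most the given operators."""
    return frozenset(ax for ax, used in USES.items() if used <= ops)


def subsets(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


# A subspecification is a pair (ops, axioms) whose axioms mention only `ops`.
SUBSPECS = [(o, a) for o in subsets(ALL_OPS) for a in subsets(USES) if F(a) <= o]


def leq(s, t):
    """Pairwise containment of subspecifications."""
    return s[0] <= t[0] and s[1] <= t[1]


def M(s):
    """Remove all operations not used in the axioms of s."""
    return (F(s[1]), s[1])


def N(s):
    """Add all operations of the full spec to s."""
    return (ALL_OPS, s[1])


def I(s):
    """Remove all axioms of s."""
    return (s[0], frozenset())


def J(s):
    """Add all axioms stateable using the operations of s."""
    return (s[0], G(s[0]))


mn_galois = all(leq(s, N(M(s))) and leq(M(N(s)), s) for s in SUBSPECS)
ij_galois = all(leq(s, J(I(s))) and leq(I(J(s)), s) for s in SUBSPECS)
```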